Quick Links

Re: Support UPDATE table SET(*)=...

Lists:	pgsql-hackers

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Support UPDATE table SET(*)=...
Date:	2014-10-15 08:02:56
Message-ID:	CAOeZVif44gNx-uqE=qNm5FDu+LfSjK94jSm1goHMH2Fc3u1FpQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi All,

Please find attached a patch which implements support for UPDATE table1
SET(*)=...
The patch supports both UPDATE table SET(*)=(a,b,c) and UPDATE table1
SET(*)=(SELECT a,b,c FROM...).
It solves the problem of doing UPDATE from a record variable of the same
type as the table e.g. update foo set (*) = (select foorec.*) where ...;

The design is simple. It basically expands the * in transformation stage,
does the necessary type checking and adds it to the parse tree. This allows
for normal execution for the rest of the stages.

Thoughts/Comments?

Regards,

Atri

Attachment	Content-Type	Size
updatestar_ver1.patch	application/x-download	11.4 KB

From:	Marti Raudsepp <marti(at)juffo(dot)org>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-15 08:44:48
Message-ID:	CABRT9RBSKbJuWykJ=LcEvy8iL78WtT+bvbmJWnmTK_Dx-H1oeQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...

I presume you haven't read Tom Lane's proposal and discussion about
multiple column assignment in UPDATE:
http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
(Assigning all columns was also discussed there)

And there's a WIP patch:
http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us

Regards,
Marti

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Marti Raudsepp <marti(at)juffo(dot)org>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-15 08:48:25
Message-ID:	CAOeZViff=+O79oOM9CVuwpvJGcfrJc5k=-Aq6ncVUR0zYStQMw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wednesday, October 15, 2014, Marti Raudsepp <marti(at)juffo(dot)org> wrote:

> Hi
>
> On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com
> <javascript:;>> wrote:
> > Please find attached a patch which implements support for UPDATE table1
> > SET(*)=...
>
> I presume you haven't read Tom Lane's proposal and discussion about
> multiple column assignment in UPDATE:
> http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
> (Assigning all columns was also discussed there)
>
> And there's a WIP patch:
> http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us
>
>
>
Thanks for the links, but this patch only targets SET(*) case, which, if I
understand correctly, the patch you mentioned doesn't directly handle (If I
understand correctly, the target of the two patches is different).

Regards,

Atri

--
Regards,

Atri
*l'apprenant*

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Marti Raudsepp <marti(at)juffo(dot)org>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-15 09:32:52
Message-ID:	CAOeZViePTEb8f7GV96DBNDsEGQV9VpDPexfh6w8yntvtQzvQYA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Oct 15, 2014 at 2:18 PM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:

>
> On Wednesday, October 15, 2014, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
>
>> Hi
>>
>> On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
>> wrote:
>> > Please find attached a patch which implements support for UPDATE table1
>> > SET(*)=...
>>
>> I presume you haven't read Tom Lane's proposal and discussion about
>> multiple column assignment in UPDATE:
>> http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
>> (Assigning all columns was also discussed there)
>>
>> And there's a WIP patch:
>> http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us
>>
>>
>>
> Thanks for the links, but this patch only targets SET(*) case, which, if I
> understand correctly, the patch you mentioned doesn't directly handle (If I
> understand correctly, the target of the two patches is different).
>
>
Digging more, I figured that the patch I posted builds on top of Tom's
patch, since it did not add whole row cases.

Regards,

Atri

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 14:15:27
Message-ID:	CAHyXU0z1s_WHzpjDCiOTGgOYjgEa0a=vzQdNN7poyKtiMxxGig@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>
>
> On Wednesday, October 15, 2014, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
>>
>> Hi
>>
>> On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>> > Please find attached a patch which implements support for UPDATE table1
>> > SET(*)=...
>>
>> I presume you haven't read Tom Lane's proposal and discussion about
>> multiple column assignment in UPDATE:
>> http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
>> (Assigning all columns was also discussed there)
>>
>> And there's a WIP patch:
>> http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us
>
> Thanks for the links, but this patch only targets SET(*) case, which, if I
> understand correctly, the patch you mentioned doesn't directly handle (If I
> understand correctly, the target of the two patches is different).

Yeah -- in fact, there was some discussion about this exact case.
This patch solves a very important problem: when doing record
operations to move data between databases with identical schema
there's currently no way to 'update' in a generic way without building
out the entire field list via complicated and nasty dynamic SQL. I'm
not sure about the proposed syntax though; it seems a little weird to
me. Any particular reason why you couldn't have just done:

UPDATE table1 SET * = a,b,c, ...

also,

UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);

seems cleaner than the proposed syntax for row assignment. Tom
objected though IIRC.

merlin

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 14:50:25
Message-ID:	CAOeZVieR_aAJEpvzsdWZmV3TrS14-732=rvNAv4s689JTPB_ew@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Oct 17, 2014 at 7:45 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
> >
> >
> > On Wednesday, October 15, 2014, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
> >>
> >> Hi
> >>
> >> On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
> wrote:
> >> > Please find attached a patch which implements support for UPDATE
> table1
> >> > SET(*)=...
> >>
> >> I presume you haven't read Tom Lane's proposal and discussion about
> >> multiple column assignment in UPDATE:
> >> http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
> >> (Assigning all columns was also discussed there)
> >>
> >> And there's a WIP patch:
> >> http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us
> >
> > Thanks for the links, but this patch only targets SET(*) case, which, if
> I
> > understand correctly, the patch you mentioned doesn't directly handle
> (If I
> > understand correctly, the target of the two patches is different).
>
> Yeah -- in fact, there was some discussion about this exact case.
> This patch solves a very important problem: when doing record
> operations to move data between databases with identical schema
> there's currently no way to 'update' in a generic way without building
> out the entire field list via complicated and nasty dynamic SQL.

Thanks!

> I'm
> not sure about the proposed syntax though; it seems a little weird to
> me. Any particular reason why you couldn't have just done:
>
> UPDATE table1 SET * = a,b,c, ...
>
> also,
>
> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
>

I honestly have not spent a lot of time thinking about the exact syntax
that may be acceptable. If we have arguments for or against a specific
syntax, I will be glad to incorporate them.

>
>

From:	Marko Tiikkaja <marko(at)joh(dot)to>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 14:55:17
Message-ID:	54412DD5.3090803@joh.to
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10/17/14 4:15 PM, Merlin Moncure wrote:
> Any particular reason why you couldn't have just done:
>
> UPDATE table1 SET * = a,b,c, ...

That just looks wrong to me. I'd prefer (*) = .. over that any day.

> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
>
> seems cleaner than the proposed syntax for row assignment. Tom
> objected though IIRC.

I don't know about Tom, but I didn't like that:
http://www.postgresql.org/message-id/5364C982.7060003@joh.to

.marko

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Marko Tiikkaja <marko(at)joh(dot)to>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:03:41
Message-ID:	CAHyXU0wicj-kpiO=Ytnha9Vb27j2Q9aH-ZuPLCa34jtfN1Ug+Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> On 10/17/14 4:15 PM, Merlin Moncure wrote:
>>
>> Any particular reason why you couldn't have just done:
>>
>> UPDATE table1 SET * = a,b,c, ...
>
>
> That just looks wrong to me. I'd prefer (*) = .. over that any day.
>
>> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
>>
>> seems cleaner than the proposed syntax for row assignment. Tom
>> objected though IIRC.
>
> I don't know about Tom, but I didn't like that:
> http://www.postgresql.org/message-id/5364C982.7060003@joh.to

Hm, I didn't understand your objection:

<quoting>
So e.g.:
UPDATE foo f SET f = ..;

would resolve to the table, despite there being a column called "f"?
That would break backwards compatibility.
</quoting>

That's not correct: it should work exactly as 'select' does; given a
conflict resolve the field name, so no backwards compatibility issue.

merlin

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:04:27
Message-ID:	11264.1413558267@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
>> Thanks for the links, but this patch only targets SET(*) case, which, if I
>> understand correctly, the patch you mentioned doesn't directly handle (If I
>> understand correctly, the target of the two patches is different).

> Yeah -- in fact, there was some discussion about this exact case.
> This patch solves a very important problem: when doing record
> operations to move data between databases with identical schema
> there's currently no way to 'update' in a generic way without building
> out the entire field list via complicated and nasty dynamic SQL. I'm
> not sure about the proposed syntax though; it seems a little weird to
> me. Any particular reason why you couldn't have just done:

> UPDATE table1 SET * = a,b,c, ...

> also,

> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);

> seems cleaner than the proposed syntax for row assignment. Tom
> objected though IIRC.

That last proposal is no good because it would be ambiguous if the
table contains a column by that name. The "(*)" idea actually seems
not bad, since it's shorthand for a parenthesized column list.

I'm not sure about the patch itself though --- in particular, it
doesn't seem to me that it should be touching transformTargetList,
since that doesn't have anything to do with expansion of multiassignments
today. Probably this is a symptom of having chosen a poor representation
of the construct in the raw grammar output. However, I've not exactly
wrapped my head around what that representation is ... the lack of any
comments explaining it doesn't help.

A larger question is whether it's appropriate to do the *-expansion
at parse time, or whether we'd need to postpone it to later in order
to handle reasonable use-cases. Since we expand "SELECT *" at parse
time (and are mandated to do so by the SQL spec, I believe), it seems
consistent to do this at parse time as well; but perhaps there is a
counter argument.

regards, tom lane

From:	Marko Tiikkaja <marko(at)joh(dot)to>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:06:44
Message-ID:	54413084.4060109@joh.to
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10/17/14 5:03 PM, Merlin Moncure wrote:
> Hm, I didn't understand your objection:
>
> <quoting>
> So e.g.:
> UPDATE foo f SET f = ..;
>
> would resolve to the table, despite there being a column called "f"?
> That would break backwards compatibility.
> </quoting>
>
> That's not correct: it should work exactly as 'select' does; given a
> conflict resolve the field name, so no backwards compatibility issue.

local:marko=# show server_version;
server_version
----------------
9.1.13
(1 row)

local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0

This query would change meaning with your suggestion.

I'm not saying it would be a massive problem in practice, but I think we
should first consider options which don't break backwards compatibility,
even if some consider them "less clean".

.marko

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:10:18
Message-ID:	11315.1413558618@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>> I don't know about Tom, but I didn't like that:
>> http://www.postgresql.org/message-id/5364C982.7060003@joh.to

> Hm, I didn't understand your objection:

> <quoting>
> So e.g.:
> UPDATE foo f SET f = ..;

> would resolve to the table, despite there being a column called "f"?
> That would break backwards compatibility.
> </quoting>

> That's not correct: it should work exactly as 'select' does; given a
> conflict resolve the field name, so no backwards compatibility issue.

The point is that it's fairly messy (and bug-prone) to have a syntax
where we have to make an arbitrary choice between two reasonable
interpretations.

If you look back at the whole thread Marko's above-cited message is in,
we'd considered a bunch of different possible syntaxes for this, and
none of them had much support. The "(*)" idea actually is starting to
look pretty good to me.

regards, tom lane

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:11:26
Message-ID:	CAHyXU0xfjNdnKqBJUeVveQEiyHGwqGOP_A2J+-hdgD0iM7CCcw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Oct 17, 2014 at 10:10 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>>> I don't know about Tom, but I didn't like that:
>>> http://www.postgresql.org/message-id/5364C982.7060003@joh.to
>
>> Hm, I didn't understand your objection:
>
>> <quoting>
>> So e.g.:
>> UPDATE foo f SET f = ..;
>
>> would resolve to the table, despite there being a column called "f"?
>> That would break backwards compatibility.
>> </quoting>
>
>> That's not correct: it should work exactly as 'select' does; given a
>> conflict resolve the field name, so no backwards compatibility issue.
>
> The point is that it's fairly messy (and bug-prone) to have a syntax
> where we have to make an arbitrary choice between two reasonable
> interpretations.
>
> If you look back at the whole thread Marko's above-cited message is in,
> we'd considered a bunch of different possible syntaxes for this, and
> none of them had much support. The "(*)" idea actually is starting to
> look pretty good to me.

Hm, I'll take it then.

merlin

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Marko Tiikkaja <marko(at)joh(dot)to>
Cc:	Merlin Moncure <mmoncure(at)gmail(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:16:11
Message-ID:	11364.1413558971@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> local:marko=#* create table foo(f int);
> CREATE TABLE
> local:marko=#* update foo f set f=1;
> UPDATE 0

> This query would change meaning with your suggestion.

I think it wouldn't; Merlin is proposing that f would be taken as the
field name. A more realistic objection goes like this:

create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works

It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.

regards, tom lane

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:22:11
Message-ID:	CAHyXU0y92ukn0jSpD36TGChgzK8=1WhiFvBeXDNtjRV5oHVwPg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Marko Tiikkaja <marko(at)joh(dot)to> writes:
>> local:marko=#* create table foo(f int);
>> CREATE TABLE
>> local:marko=#* update foo f set f=1;
>> UPDATE 0
>
>> This query would change meaning with your suggestion.
>
> I think it wouldn't; Merlin is proposing that f would be taken as the
> field name. A more realistic objection goes like this:
>
> create table foo(f int, g int);
> update foo x set x = (1,2); -- works
> alter table foo add column x int;
> update foo x set x = (1,2,3); -- no longer works
>
> It's not a real good thing if a column addition or renaming can
> so fundamentally change the nature of a query.

That's exactly how SELECT works. I also dispute that the user should
be surprised in such cases.

merlin

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:30:51
Message-ID:	11801.1413559851@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I think it wouldn't; Merlin is proposing that f would be taken as the
>> field name. A more realistic objection goes like this:
>>
>> create table foo(f int, g int);
>> update foo x set x = (1,2); -- works
>> alter table foo add column x int;
>> update foo x set x = (1,2,3); -- no longer works
>>
>> It's not a real good thing if a column addition or renaming can
>> so fundamentally change the nature of a query.

> That's exactly how SELECT works. I also dispute that the user should
> be surprised in such cases.

Well, the reason we have a problem in SELECT is that we support an
unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
"SELECT foo FROM foo" could represent a whole-row selection is nowhere
to be found in the SQL standard, for good reason IMO. But we've never
had the courage to break cleanly with this Berkeley leftover and
insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
I'd just as soon not introduce the same kind of ambiguity into UPDATE
if we have a reasonable alternative.

regards, tom lane

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 15:47:57
Message-ID:	CAHyXU0zz7MLpH99dLBMCK=UpdVjxWtChtQf658Pg3MpXUi+2bg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Oct 17, 2014 at 10:30 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I think it wouldn't; Merlin is proposing that f would be taken as the
>>> field name. A more realistic objection goes like this:
>>>
>>> create table foo(f int, g int);
>>> update foo x set x = (1,2); -- works
>>> alter table foo add column x int;
>>> update foo x set x = (1,2,3); -- no longer works
>>>
>>> It's not a real good thing if a column addition or renaming can
>>> so fundamentally change the nature of a query.
>
>> That's exactly how SELECT works. I also dispute that the user should
>> be surprised in such cases.
>
> Well, the reason we have a problem in SELECT is that we support an
> unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
> "SELECT foo FROM foo" could represent a whole-row selection is nowhere
> to be found in the SQL standard, for good reason IMO. But we've never
> had the courage to break cleanly with this Berkeley leftover and
> insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
> I'd just as soon not introduce the same kind of ambiguity into UPDATE
> if we have a reasonable alternative.

Ah, interesting point (I happen to like the terse syntax and use it
often). This is for posterity anyways since you guys seem to like
Atri's proposal, which surprised me. However, I think you're over
simplifying things here. Syntax aside: I think
SELECT f FROM foo f;
and a hypothetical
SELECT row(f.*) FROM foo f;

give different semantics. The former gives an object of type 'f' and
the latter gives type 'row'. To get parity, you'd have to add an
extra cast which means you'd have to play tricky games to avoid extra
performance overhead besides being significantly more verbose. I'm
aware some of the other QUELisms are pretty dodgy and have burned us
in the past (like that whole function as record reference thing) but
pulling a record as a field in select is pretty nice. It's also
widely used and quite useful in json serialization.

merlin

From:	Marti Raudsepp <marti(at)juffo(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-17 16:05:09
Message-ID:	CABRT9RAaA44A=7O8armYFmmXoRTW_0P20acazTdds_=_hoGWJA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Oct 17, 2014 6:16 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> A more realistic objection goes like this:
>
> create table foo(f int, g int);
> update foo x set x = (1,2); -- works
> alter table foo add column x int;
> update foo x set x = (1,2,3); -- no longer works
>
> It's not a real good thing if a column addition or renaming can
> so fundamentally change the nature of a query.

I think a significant use case for this feature is when you already have a
row-value and want to persist it in the database, like you can do with
INSERT:
insert into foo select * from populate_record_json(null::foo, '{...}');

In this case the opposite is true: requiring explicit column names would
break the query if you add columns to the table. The fact that you can't
reasonably use populate_record/_json with UPDATE is a significant omission.
IMO this really speaks for supporting shorthand whole-row assignment,
whatever the syntax.

Regards,
Marti

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-22 15:05:29
Message-ID:	CAHyXU0zSSxJgpZAMFP5phucfsycgcr62kQK+FqOYmi6BfueO4g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Oct 17, 2014 at 10:47 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Fri, Oct 17, 2014 at 10:30 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> I think it wouldn't; Merlin is proposing that f would be taken as the
>>>> field name. A more realistic objection goes like this:
>>>>
>>>> create table foo(f int, g int);
>>>> update foo x set x = (1,2); -- works
>>>> alter table foo add column x int;
>>>> update foo x set x = (1,2,3); -- no longer works
>>>>
>>>> It's not a real good thing if a column addition or renaming can
>>>> so fundamentally change the nature of a query.
>>
>>> That's exactly how SELECT works. I also dispute that the user should
>>> be surprised in such cases.
>>
>> Well, the reason we have a problem in SELECT is that we support an
>> unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
>> "SELECT foo FROM foo" could represent a whole-row selection is nowhere
>> to be found in the SQL standard, for good reason IMO. But we've never
>> had the courage to break cleanly with this Berkeley leftover and
>> insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
>> I'd just as soon not introduce the same kind of ambiguity into UPDATE
>> if we have a reasonable alternative.
>
> Ah, interesting point (I happen to like the terse syntax and use it
> often). This is for posterity anyways since you guys seem to like
> Atri's proposal, which surprised me. However, I think you're over
> simplifying things here. Syntax aside: I think
> SELECT f FROM foo f;
> and a hypothetical
> SELECT row(f.*) FROM foo f;
>
> give different semantics. The former gives an object of type 'f' and
> the latter gives type 'row'. To get parity, you'd have to add an
> extra cast which means you'd have to play tricky games to avoid extra
> performance overhead besides being significantly more verbose. I'm
> aware some of the other QUELisms are pretty dodgy and have burned us
> in the past (like that whole function as record reference thing) but
> pulling a record as a field in select is pretty nice. It's also
> widely used and quite useful in json serialization.

Been thinking about this in the back of my mind the last couple of
days. There are some things you can do with the QUEL 'table as
column' in select syntax that are impossible otherwise, at least
today, and its usage is going to proliferate because of that. Row
construction via () or row() needs to be avoided whenever the column
names are important and there is no handy type to cast to. For
example, especially during json serialization it's convenient to do
things like:

select
array_agg((select q from (select a, b) q) order by ...)
from foo;

...where a,b are fields of foo. FWICT, this is the only way to do
json serialization of arbitrary runtime row constructions in a way
that does not anonymize the type. Until I figured out this trick I
used to create lots of composite types that served no purpose other
than to give me a type to cast to which is understandably annoying.

if:
select (x).* from (select (1, 2) as x) q;

worked and properly expanded x to given names should they exist and:
SELECT row(f.*) FROM foo f;

worked and did same, and:
SELECT (row(f.*)).* FROM foo f;

was as reasonably performant and gave the same results as:
SELECT (f).* FROM foo f;

...then IMSNHO you'd have a reasonable path to deprecating the QUEL
inspired syntax. Food for thought.

merlin

From:	Marko Tiikkaja <marko(at)joh(dot)to>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-10-29 21:12:05
Message-ID:	54515825.2030607@joh.to
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi Atri,

Sorry for the delay. With pgconf.eu and all, it's been very hard to
find the time to look at this.

On 10/15/14, 10:02 AM, Atri Sharma wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...
> The patch supports both UPDATE table SET(*)=(a,b,c) and UPDATE table1
> SET(*)=(SELECT a,b,c FROM...).
> It solves the problem of doing UPDATE from a record variable of the same
> type as the table e.g. update foo set (*) = (select foorec.*) where ...;

Excellent! This is a very welcome change.

I've had a few looks at this patch and I have a few comments:

1) This doesn't work for the zero-column table case at all:
CREATE TABLE foo();
UPDATE foo SET (*) = (SELECT);
ERROR: number of columns does not match number of values
2) What's the purpose of the second condition here?
if (!(origTarget) || !(origTarget->name))
3) The extra parentheses around everything make this code for some
reason very hard to read.
4) transformTargetList() is a mess right now. If this is the
approach we want to take, the common code should probably be refactored
into a function. But the usage of List as a somehow magical way to
represent the SET (*) case makes me feel weird inside.
5) The complete lack of regression tests make it hard to poke around
the code to try and figure out what each line/condition is trying to do.

I feel like I understand what this code is doing and some details feel a
bit icky, but I'm not the right person to comment on whether the broad
strokes are on the right canvas or not. Maybe someone else wants to
take a closer look before Atri spends too much time on this approach?
After all, this patch has a very distinctive WIP feel to it, so I guess
feedback on the general approach is what's being sought after here, and
in that area I consider my skills and knowledge lacking.

> The design is simple. It basically expands the * in transformation stage,
> does the necessary type checking and adds it to the parse tree. This allows
> for normal execution for the rest of the stages.

I can't poke any big holes into this approach (disregarding the details
of this implementation), but perhaps someone else can?

.marko

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Marko Tiikkaja <marko(at)joh(dot)to>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-11-25 19:26:59
Message-ID:	1484.1416943619@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> On 10/15/14, 10:02 AM, Atri Sharma wrote:
>> Please find attached a patch which implements support for UPDATE table1
>> SET(*)=...

> I've had a few looks at this patch and I have a few comments:

> 1) This doesn't work for the zero-column table case at all:
> CREATE TABLE foo();
> UPDATE foo SET (*) = (SELECT);
> ERROR: number of columns does not match number of values
> 2) What's the purpose of the second condition here?
> if (!(origTarget) || !(origTarget->name))
> 3) The extra parentheses around everything make this code for some
> reason very hard to read.
> 4) transformTargetList() is a mess right now. If this is the
> approach we want to take, the common code should probably be refactored
> into a function. But the usage of List as a somehow magical way to
> represent the SET (*) case makes me feel weird inside.
> 5) The complete lack of regression tests make it hard to poke around
> the code to try and figure out what each line/condition is trying to do.

> I feel like I understand what this code is doing and some details feel a
> bit icky, but I'm not the right person to comment on whether the broad
> strokes are on the right canvas or not. Maybe someone else wants to
> take a closer look before Atri spends too much time on this approach?

FWIW, I opined upthread that transformTargetList was not the place to
be handling this; it should be done in the same place(s) that support
the existing MultiAssignRef feature, and transformTargetList is not
that.

I think what's likely missing here is a clear design for the raw parse
tree representation (what's returned by the bison grammar). The patch
seems to be trying to skate by without creating any new parse node types
or fields, but that may well be a bad idea. At the very least there
needs to be some commentary added to parsenodes.h explaining what the
representation actually is (cf commentary there for MultiAssignRef).

Also, I think it's a mistake not to be following the MultiAssignRef
model for the case of "(*) = ctext_row". We optimize the ROW-source
case at the grammar stage when there's a fixed number of target columns,
because that's a very easy transformation --- but you can't do it like
that when there's not. It's possible that that optimization should be
delayed till later even in the existing case; in general, optimizing
in gram.y is a bad habit that's better avoided ...

regards, tom lane

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Marko Tiikkaja <marko(at)joh(dot)to>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-12-08 00:36:10
Message-ID:	CAB7nPqTR+LJrKYo4X-S28+zc5Hxp=PSvrbCsR1FbFrQdmS_Jng@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Nov 26, 2014 at 4:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think what's likely missing here is a clear design for the raw parse
> tree representation (what's returned by the bison grammar). The patch
> seems to be trying to skate by without creating any new parse node types
> or fields, but that may well be a bad idea. At the very least there
> needs to be some commentary added to parsenodes.h explaining what the
> representation actually is (cf commentary there for MultiAssignRef).
>
> Also, I think it's a mistake not to be following the MultiAssignRef
> model for the case of "(*) = ctext_row". We optimize the ROW-source
> case at the grammar stage when there's a fixed number of target columns,
> because that's a very easy transformation --- but you can't do it like
> that when there's not. It's possible that that optimization should be
> delayed till later even in the existing case; in general, optimizing
> in gram.y is a bad habit that's better avoided ...
Marking as "returned with feedback" based on those comments.
--
Michael

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2014-12-15 10:50:26
Message-ID:	CAOeZVid1iuz7K-hp3-Vr5Ec_cUHdY=fYzWNu+yqAMUTruFKQOw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Dec 8, 2014 at 6:06 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
>
> On Wed, Nov 26, 2014 at 4:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I think what's likely missing here is a clear design for the raw parse
> > tree representation (what's returned by the bison grammar). The patch
> > seems to be trying to skate by without creating any new parse node types
> > or fields, but that may well be a bad idea. At the very least there
> > needs to be some commentary added to parsenodes.h explaining what the
> > representation actually is (cf commentary there for MultiAssignRef).
> >
> > Also, I think it's a mistake not to be following the MultiAssignRef
> > model for the case of "(*) = ctext_row". We optimize the ROW-source
> > case at the grammar stage when there's a fixed number of target columns,
> > because that's a very easy transformation --- but you can't do it like
> > that when there's not. It's possible that that optimization should be
> > delayed till later even in the existing case; in general, optimizing
> > in gram.y is a bad habit that's better avoided ...
> Marking as "returned with feedback" based on those comments.
>
>
Thank you everybody for your comments.

I am indeed trying to avoid a new node since my intuitive feeling says that
we do not need a new node for a functionality which is present in other
commands albeit in different form. However, I agree that documentation
explaining parse representation is lacking and shall improve that. Also, in
accordance to Tom's comments, I shall look for not touching target list
directly and follow the path of MultiAssignRef.

I have moved patch to current CF and marked it as Waiting on Author since I
plan to resubmit after addressing the concerns.

Regards,

Atri

--
Regards,

Atri
*l'apprenant*

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-01-15 07:51:33
Message-ID:	CAB7nPqRRaepLmSsZMH4AxMkhC=KGMUUK40Q0n+rdT0vJv1bcJA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Dec 15, 2014 at 7:50 PM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:
> I have moved patch to current CF and marked it as Waiting on Author since I
> plan to resubmit after addressing the concerns.
Nothing happened in the last month, so marking as returned with feedback.
--
Michael

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-02-14 04:33:05
Message-ID:	CAOeZVidJrde8AOFJALzfWJsfO+6U-cwTNucJUfaHK+Y7p_Ga5A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi all,

Sorry for the delay.

Please find attached latest version of UPDATE (*) patch.The patch
implements review comments and Tom's gripe about touching
transformTargetList is addressed now. I have added regression tests and
simplified parser representation a bit.

Regards,

Atri

Attachment	Content-Type	Size
updatestar14215.patch	text/x-diff	12.1 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 16:53:44
Message-ID:	24065.1428425624@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Atri Sharma <atri(dot)jiit(at)gmail(dot)com> writes:
> Please find attached latest version of UPDATE (*) patch.The patch
> implements review comments and Tom's gripe about touching
> transformTargetList is addressed now. I have added regression tests and
> simplified parser representation a bit.

I spent a fair amount of time cleaning this patch up to get it into
committable shape, but as I was working on the documentation I started
to lose enthusiasm for it, because I was having a hard time coming up
with compelling examples. The originally proposed motivation was

>> It solves the problem of doing UPDATE from a record variable of the same
>> type as the table e.g. update foo set (*) = (select foorec.*) where ...;

but it feels to me that this is not actually a very good solution to that
problem --- even granting that the problem needs to be solved, a claim
that still requires some justification IMO. Here is a possible use-case:

regression=# create table src (f1 int, f2 text, f3 real);
CREATE TABLE
regression=# create table dst (f1 int, f2 text, f3 real);
CREATE TABLE
regression=# create rule r1 as on update to src do instead
regression-# update dst set (*) = (new.*) where dst.f1 = old.f1;
ERROR: number of columns does not match number of values
LINE 2: update dst set (*) = (new.*) where dst.f1 = old.f1;
^

So that's annoying. You can work around it with this very unobvious
(and undocumented) syntax hack:

regression=# create rule r1 as on update to src do instead
regression-# update dst set (*) = (select new.*) where dst.f1 = old.f1;
CREATE RULE

But what happens if dst has no matching row? Your data goes into the
bit bucket, that's what. What you really wanted here was some kind of
UPSERT. There's certainly a possible use-case for a notation like this
in UPSERT, when we get around to implementing that; but I'm less than
convinced that we need it in plain UPDATE.

What's much worse though is that the rule that actually gets stored is:

That is, you don't actually have any protection at all against future
additions of columns, which seems like it would be most of the point
of using a notation like this.

We could possibly address that by postponing expansion of the *-notation
to rule rewrite time, but that seems like it'd be a complete disaster in
terms of modularity, or even usability (since column-mismatch errors
wouldn't get raised till then either). And it'd certainly be a far more
invasive and complex patch than this.

So I'm feeling that this may not be a good idea, or at least not a good
implementation of the idea. I'm inclined to reject the patch rather than
lock us into something that is not standard and doesn't really do what
people would be likely to want.

Attached is the updated patch that I had before arriving at this
discouraging conclusion.

regards, tom lane

Attachment	Content-Type	Size
updatestar_v3.patch	text/x-diff	20.3 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 18:19:15
Message-ID:	26496.1428430755@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> So I'm feeling that this may not be a good idea, or at least not a good
> implementation of the idea. I'm inclined to reject the patch rather than
> lock us into something that is not standard and doesn't really do what
> people would be likely to want.

BTW, a potentially workable fix to the problem of not wanting to lock
down column lists in stored rules is to create a syntax that represents
whole-row, record-oriented assignment directly. Then we need not be
concerned with individual columns at parse time at all. So imagine
something like this:

UPDATE dst SET * = new WHERE ...;

UPDATE dst SET * = (SELECT src FROM src WHERE ...);

or if you needed to construct a row value at runtime you could write

UPDATE dst SET * = ROW(x,y,z) WHERE ...;

UPDATE dst SET * = (SELECT ROW(x,y,z) FROM src WHERE ...);

The main bit of functionality that would be lost compared to the current
patch is the ability to use DEFAULT for some of the row members. But I am
not sure there is a compelling use-case for that: seems like if you have
some DEFAULTs in there then it's unlikely that you don't know the column
list accurately, so the existing (col1,col2,...) syntax will serve fine.

This seems like it might not be unduly complex to implement, although
it would have roughly nothing in common with the current patch.

If we were to go in this direction, it would be nice to at the same time
add a similar whole-record syntax for INSERT. I'm not sure exactly what
that should look like though. Also, again, we ought to be paying
attention to how this would match up with UPSERT syntax.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 19:00:44
Message-ID:	20150407190044.GL4369@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:

> I spent a fair amount of time cleaning this patch up to get it into
> committable shape, but as I was working on the documentation I started
> to lose enthusiasm for it, because I was having a hard time coming up
> with compelling examples. The originally proposed motivation was
>
> >> It solves the problem of doing UPDATE from a record variable of the same
> >> type as the table e.g. update foo set (*) = (select foorec.*) where ...;
>
> but it feels to me that this is not actually a very good solution to that
> problem --- even granting that the problem needs to be solved, a claim
> that still requires some justification IMO.

How about an UPDATE ran inside a plpgsql function, which is using a row
variable of the table type? You could assign values to individual
columns of q row variable, and run the multicolumn UPDATE last.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Peter Geoghegan <pg(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 19:01:38
Message-ID:	CAM3SWZR8QsqorECT3AgT4pHAOaUbuR90TZsQYZqYR7nu1HMzLQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> If we were to go in this direction, it would be nice to at the same time
> add a similar whole-record syntax for INSERT. I'm not sure exactly what
> that should look like though. Also, again, we ought to be paying
> attention to how this would match up with UPSERT syntax.

I expressed concern about allowing this for UPSERT [1].

To be fair, VoltDB's new UPSERT statement allows something similar (or
rather mandates it, since you cannot just update some columns), and
that doesn't look wholly unreasonable. I still don't like the idea of
supporting this, though. I'm not aware of any other system allowing
something like this for either MERGE or a non-standard UPSERT.

[1] http://www.postgresql.org/message-id/CAM3SWZT=VXBJ7QKAidAmYbU40aP10udSqOOqhViX3Ykj7WBv9A@mail.gmail.com
--
Peter Geoghegan

From:	Peter Geoghegan <pg(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 19:04:50
Message-ID:	CAM3SWZRCAsFo4d4v6NYwOEHXXQj4teqTf-ft9mUBjmzpP6uMYg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Apr 7, 2015 at 12:01 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> I still don't like the idea of
> supporting this, though. I'm not aware of any other system allowing
> something like this for either MERGE or a non-standard UPSERT.

That includes MySQL, BTW. Only their REPLACE statement (which is a
disaster for various reasons) can do something like this. Their INSERT
... ON DUPLICATE KEY UPDATE statement (which is roughly comparable to
the proposed UPSERT patch) cannot update the entire row using a terse
expression that references a row excluded from insertion (following
the implementation taking the UPDATE path).

--
Peter Geoghegan

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Peter Geoghegan <pg(at)heroku(dot)com>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 19:22:18
Message-ID:	27879.1428434538@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Peter Geoghegan <pg(at)heroku(dot)com> writes:
> On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If we were to go in this direction, it would be nice to at the same time
>> add a similar whole-record syntax for INSERT. I'm not sure exactly what
>> that should look like though. Also, again, we ought to be paying
>> attention to how this would match up with UPSERT syntax.

> I expressed concern about allowing this for UPSERT [1].

Yeah, your analogy to "SELECT *" being considered dangerous in production
is not without merit. However, to the extent that the syntax is used to
assign from a composite variable of the same (or compatible) data type,
it seems like it would be safe enough. IOW, I think that analogy holds
for the syntax implemented by the current patch, but not what I suggested
in my followup.

regards, tom lane

From:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-07 22:36:31
Message-ID:	55245BEF.6050607@BlueTreble.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 4/7/15 2:00 PM, Alvaro Herrera wrote:
> Tom Lane wrote:
>
>> I spent a fair amount of time cleaning this patch up to get it into
>> committable shape, but as I was working on the documentation I started
>> to lose enthusiasm for it, because I was having a hard time coming up
>> with compelling examples. The originally proposed motivation was
>>
>>>> It solves the problem of doing UPDATE from a record variable of the same
>>>> type as the table e.g. update foo set (*) = (select foorec.*) where ...;
>>
>> but it feels to me that this is not actually a very good solution to that
>> problem --- even granting that the problem needs to be solved, a claim
>> that still requires some justification IMO.
>
> How about an UPDATE ran inside a plpgsql function, which is using a row
> variable of the table type? You could assign values to individual
> columns of q row variable, and run the multicolumn UPDATE last.

Along similar lines, I've often wanted something akin to *, but allowing
finer control over what you got. Generally when I want this it's because
I really do want everything (as in, don't want to re-code a bunch of
stuff if a column is added), but perhaps not the surrogate key field. Or
I want everything, but rename some field to something else.

Basically, another way to do what Alvaro is suggesting (though, the
ability to rename something is new...)

If we had that ability I think UPDATE * would become a lot more useful.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

From:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 09:44:26
Message-ID:	87h9srouxu.fsf@news-spur.riddles.org.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> I spent a fair amount of time cleaning this patch up to get it
Tom> into committable shape, but as I was working on the documentation
Tom> I started to lose enthusiasm for it, because I was having a hard
Tom> time coming up with compelling examples. The originally proposed
Tom> motivation was

>>> It solves the problem of doing UPDATE from a record variable of the same
>>> type as the table e.g. update foo set (*) = (select foorec.*) where ...;

There are a number of motivating examples for this (which have nothing
to do with rules; I doubt anyone cares much about that).

The fundamental point is that currently, given a table "foo" and some
column or variable of foo's rowtype, you can do this:

insert into foo select foorec.* [from ...]

but there is no comparable way to do an update without naming every
column explicitly, the closest being

update foo set (a,b,...) = (foorec.a, foorec.b, ...) where ...

One example that comes up occasionally (and that I've had to do myself
more than once) is this: given a table "foo" and another with identical
schema "reference_foo", apply appropriate inserts, updates and deletes
to table "foo" to make the content of the two tables identical. This can
be done these days with wCTEs:

with
t_diff as (select o.id as o_id, n.id as n_id, o, n
from foo o full outer join reference_foo n on (o.id=n.id)
where (o.*) is distinct from (n.*)),
ins as (insert into foo select (n).* from t_diff where o_id is null),
del as (delete from foo
where id in (select o_id from t_diff where n_id is null)),
upd as (update foo
set (col1,col2,...) = ((n).col1,(n).col2,...) -- XXX
from t_diff
where foo.id = n_id and o_id = n_id)
select count(*) filter (where o_id is null) as num_ins,
count(*) filter (where o_id = n_id) as num_upd,
count(*) filter (where n_id is null) as num_del
from t_diff;

(This would be preferred over simply replacing the table content if the
table is large and the changes few, you want to audit the changes, you
need to avoid interfering with concurrent selects, etc.)

The update part of that would be much improved by simply being able to
say "update all columns of foo with values from (n)". The exact syntax
isn't a big deal - though since SET (cols...) = ... is in the spec, it
seems reasonable to at least keep some kind of consistency with it.

Other examples arise from things one might want to do in plpgsql; for
example to update a record from an hstore or json value, one can use
[json_]populate_record to construct a record variable, but then it's
back to naming all the columns in order to actually perform the update
statement.

[My connection with this patch is only that I suggested it to Atri as a
possible project for him to do, because I wanted the feature and knew
others did also, and helped explain how the existing MultiAssign worked
and some of the criticism. I did not contribute any code.]

--
Andrew (irc:RhodiumToad)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 14:57:43
Message-ID:	6296.1428505063@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> I spent a fair amount of time cleaning this patch up to get it
> Tom> into committable shape, but as I was working on the documentation
> Tom> I started to lose enthusiasm for it, because I was having a hard
> Tom> time coming up with compelling examples.

> One example that comes up occasionally (and that I've had to do myself
> more than once) is this: given a table "foo" and another with identical
> schema "reference_foo", apply appropriate inserts, updates and deletes
> to table "foo" to make the content of the two tables identical. This can
> be done these days with wCTEs:

> with
> t_diff as (select o.id as o_id, n.id as n_id, o, n
> from foo o full outer join reference_foo n on (o.id=n.id)
> where (o.*) is distinct from (n.*)),
> ins as (insert into foo select (n).* from t_diff where o_id is null),
> del as (delete from foo
> where id in (select o_id from t_diff where n_id is null)),
> upd as (update foo
> set (col1,col2,...) = ((n).col1,(n).col2,...) -- XXX
> from t_diff
> where foo.id = n_id and o_id = n_id)
> select count(*) filter (where o_id is null) as num_ins,
> count(*) filter (where o_id = n_id) as num_upd,
> count(*) filter (where n_id is null) as num_del
> from t_diff;

While I agree that the UPDATE part of that desperately needs improvement,
I don't agree that the INSERT part is entirely fine. You're still relying
on a parse-time expansion of the (n).* notation, which is inefficient and
not at all robust against schema changes (the same problem as with the
patch's approach to UPDATE). So if we're taking this as a motivating
example, I'd want to see a fix that allows both INSERT and UPDATE directly
from a composite value of proper rowtype, without any expansion to
individual columns at all.

Perhaps we could adopt some syntax like
INSERT INTO table (*) values-or-select
to represent the case that the values-or-select delivers a single
composite column of the appropriate type.

> Other examples arise from things one might want to do in plpgsql; for
> example to update a record from an hstore or json value, one can use
> [json_]populate_record to construct a record variable, but then it's
> back to naming all the columns in order to actually perform the update
> statement.

Sure, but the patch as given didn't work very well for that either,
at least not if you wanted to avoid multiple evaluation of the
composite-returning function. You'd have to adopt some obscure syntax
like "UPDATE target SET (*) = (SELECT * FROM composite_function(...))".
With what I'm thinking about now you could do
UPDATE target SET * = composite_function(...)
which is a good deal less notation, and with a bit of luck it would not
require disassembling and reassembling the function's output tuple.

regards, tom lane

From:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 15:44:17
Message-ID:	87vbh6oe1h.fsf@news-spur.riddles.org.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> One example that comes up occasionally (and that I've had to do
>> myself more than once) is this: given a table "foo" and another with
>> identical schema "reference_foo", apply appropriate inserts, updates
>> and deletes to table "foo" to make the content of the two tables
>> identical. This can be done these days with wCTEs:

>> with
>> t_diff as (select o.id as o_id, n.id as n_id, o, n
>> from foo o full outer join reference_foo n on (o.id=n.id)
>> where (o.*) is distinct from (n.*)),
>> ins as (insert into foo select (n).* from t_diff where o_id is null),
>> del as (delete from foo
>> where id in (select o_id from t_diff where n_id is null)),
>> upd as (update foo
>> set (col1,col2,...) = ((n).col1,(n).col2,...) -- XXX
>> from t_diff
>> where foo.id = n_id and o_id = n_id)
>> select count(*) filter (where o_id is null) as num_ins,
>> count(*) filter (where o_id = n_id) as num_upd,
>> count(*) filter (where n_id is null) as num_del
>> from t_diff;

Tom> While I agree that the UPDATE part of that desperately needs
Tom> improvement, I don't agree that the INSERT part is entirely fine.
Tom> You're still relying on a parse-time expansion of the (n).*
Tom> notation, which is inefficient

Not in my experience a huge deal given what else is going on...

Tom> and not at all robust against schema changes (the same problem as
Tom> with the patch's approach to UPDATE).

Now this I think is wrong; I think it's just as robust against schema
changes as using the composite value directly would be. Consider the
case where foo and reference_foo match with the exception of dropped
columns; the code as it is should just work, while a variant that used
the composite values would have to explicitly deal with that.

(When I've used this kind of operation in practice, reference_foo has
just been created using CREATE TEMP TABLE reference_foo (LIKE foo); and
then populated via COPY from an external data source. Even if
reference_foo were a non-temp table, the logic of the situation requires
it to have the same schema as foo as far as SQL statements are
concerned.)

Tom> So if we're taking this as a motivating example, I'd want to see a
Tom> fix that allows both INSERT and UPDATE directly from a composite
Tom> value of proper rowtype, without any expansion to individual
Tom> columns at all.

I would argue that this is a case of the perfect being the enemy of the
good.

Tom> Perhaps we could adopt some syntax like
Tom> INSERT INTO table (*) values-or-select
Tom> to represent the case that the values-or-select delivers a single
Tom> composite column of the appropriate type.

We could, but I think in all practical cases it'll be nothing more than
a small performance optimization rather than something that really
benefits people in terms of enhanced functionality.

>> Other examples arise from things one might want to do in plpgsql; for
>> example to update a record from an hstore or json value, one can use
>> [json_]populate_record to construct a record variable, but then it's
>> back to naming all the columns in order to actually perform the update
>> statement.

Tom> Sure, but the patch as given didn't work very well for that
Tom> either,

Partly that's a limitation resulting from how much can be done with the
existing SET (...) = syntax and implementation without ripping it all
out and starting over. An incremental improvement seemed to be a more
readily achievable goal.

In practice it would indeed probably look like:

declare
new_id integer;
new_values hstore;
begin
/* do stuff */
update foo
set (*) = (select * from populate_record(foo, new_values))
where foo.id = new_id;

A agree that it would be nicer to do

update foo
set (*) = populate_record(foo, new_values)
where foo.id = new_id;

but that would be a substantially larger project. The alternative of

set * = populate_record(foo, new_values)

is less satisfactory since it introduces inconsistencies with the case
where you _do_ want to specify explicit columns, unless you also allow

set (a,b) = row_value

which is required by the spec for T641 but which we don't currently
have.

--
Andrew (irc:RhodiumToad)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 16:05:02
Message-ID:	7966.1428509102@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> and not at all robust against schema changes (the same problem as
> Tom> with the patch's approach to UPDATE).

> Now this I think is wrong; I think it's just as robust against schema
> changes as using the composite value directly would be. Consider the
> case where foo and reference_foo match with the exception of dropped
> columns; the code as it is should just work, while a variant that used
> the composite values would have to explicitly deal with that.

AFAICS you've got that backwards.

As I illustrated upthread, after parse-time expansion we would have a
command that explicitly tells the system to insert or update only the
enumerated columns. That will *not* work as desired if columns are added
later, and (if it's in a rule) I would expect the system to have to fail
a DROP command trying to drop one of those named columns, the same way
you can't drop a column referenced in a view/rule today.

On the other hand, if it's a composite-type assignment, then dealing
with things like dropped columns becomes the system's responsibility,
and we already have code for that sort of thing.

> ... The alternative of
> set * = populate_record(foo, new_values)
> is less satisfactory since it introduces inconsistencies with the case
> where you _do_ want to specify explicit columns, unless you also allow
> set (a,b) = row_value
> which is required by the spec for T641 but which we don't currently
> have.

Right, but we should be trying to move in that direction. I see your
point though that (*) is more notationally consistent with that case.
Maybe we should be looking at trying to implement T641 in full and then
including (*) as a special case of that.

Anyway, my core point here is that we should avoid parse-time expansion
of the column set. The *only* reason for doing that is that it makes the
patch simpler, not that there is some functional advantage; in fact there
are considerable functional disadvantages as we've just been through.
So I do not want to commit a patch that institutionalizes those
disadvantages just for short-term implementation simplicity.

regards, tom lane

From:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 16:24:22
Message-ID:	87r3ruob65.fsf@news-spur.riddles.org.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> Now this I think is wrong; I think it's just as robust against
>> schema changes as using the composite value directly would
>> be. Consider the case where foo and reference_foo match with the
>> exception of dropped columns; the code as it is should just work,
>> while a variant that used the composite values would have to
>> explicitly deal with that.

Tom> AFAICS you've got that backwards.

Tom> As I illustrated upthread, after parse-time expansion we would
Tom> have a command that explicitly tells the system to insert or
Tom> update only the enumerated columns. That will *not* work as
Tom> desired if columns are added later,

Where "later" is between parse analysis and execution - and if this
query is not in a rule, then any such schema change will force a
re-analysis if it's a prepared statement, no? and otherwise, we have the
tables locked against schema changes anyway? Is there a failure case
here that doesn't involve rules?

Tom> and (if it's in a rule)

well, the example I gave is not something that anyone in their right
minds would try and put in a rule.

>> ... The alternative of
>> set * = populate_record(foo, new_values)
>> is less satisfactory since it introduces inconsistencies with the case
>> where you _do_ want to specify explicit columns, unless you also allow
>> set (a,b) = row_value
>> which is required by the spec for T641 but which we don't currently
>> have.

Tom> Right, but we should be trying to move in that direction. I see
Tom> your point though that (*) is more notationally consistent with
Tom> that case. Maybe we should be looking at trying to implement T641
Tom> in full and then including (*) as a special case of that.

I would have liked to have done that, but that would have raised the
complexity of the project from "Atri can take a stab at this one with
negligible supervision" to "Andrew will have to spend more time than he
has conveniently available staring at the raw parser to try and make it
work".

As I said, the perfect is the enemy of the good.

Tom> Anyway, my core point here is that we should avoid parse-time
Tom> expansion of the column set.

Parse-time expansion of * is pretty widespread elsewhere. Changing that
for this one specific case seems a bit marginal to me - and if the main
motivation to do so is to support cases in DML rules, which are already
a foot-bazooka, I think it's honestly not worth it.

--
Andrew (irc:RhodiumToad)

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 16:33:15
Message-ID:	20150408163315.GS4369@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andrew Gierth wrote:
> >>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Tom> Right, but we should be trying to move in that direction. I see
> Tom> your point though that (*) is more notationally consistent with
> Tom> that case. Maybe we should be looking at trying to implement T641
> Tom> in full and then including (*) as a special case of that.
>
> I would have liked to have done that, but that would have raised the
> complexity of the project from "Atri can take a stab at this one with
> negligible supervision" to "Andrew will have to spend more time than he
> has conveniently available staring at the raw parser to try and make it
> work".

Not to mention that, at this stage, we should be looking at reducing the
scope of patches in commitfest rather than enlarge it.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-08 16:37:29
Message-ID:	8746.1428511049@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Andrew Gierth wrote:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>>> Tom> Right, but we should be trying to move in that direction. I see
>>> Tom> your point though that (*) is more notationally consistent with
>>> Tom> that case. Maybe we should be looking at trying to implement T641
>>> Tom> in full and then including (*) as a special case of that.

>> I would have liked to have done that, but that would have raised the
>> complexity of the project from "Atri can take a stab at this one with
>> negligible supervision" to "Andrew will have to spend more time than he
>> has conveniently available staring at the raw parser to try and make it
>> work".

Well, we've never considered implementation convenience to be more
important than getting it right, and this doesn't seem like a place
to start.

(FWIW, the raw-parser changes would be a negligible fraction of the work
involved to do it like that.)

> Not to mention that, at this stage, we should be looking at reducing the
> scope of patches in commitfest rather than enlarge it.

I already took it out of the current commitfest ;-).

regards, tom lane

From:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Support UPDATE table SET(*)=...
Date:	2015-04-10 11:01:21
Message-ID:	87oamwjlw2.fsf@news-spur.riddles.org.uk
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Well, we've never considered implementation convenience to be more
Tom> important than getting it right, and this doesn't seem like a
Tom> place to start.

I quote your own words from less than a year ago:

The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".

(from the commit message for 8f889b108)

Tom> (FWIW, the raw-parser changes would be a negligible fraction of
Tom> the work involved to do it like that.)

/*
* Ideally, we'd accept any row-valued a_expr as RHS of a multiple_set_clause.
* However, per SQL spec the row-constructor case must allow DEFAULT as a row
* member, and it's pretty unclear how to do that (unless perhaps we allow
* DEFAULT in any a_expr and let parse analysis sort it out later?). For the
* moment, the planner/executor only support a subquery as a multiassignment
* source anyhow, so we need only accept ctext_row and subqueries here.
*/

(The "punt to parse analysis" solution looks reasonable to me, but I'm
just as in the dark as you were when you wrote that as to any other
possible solution.)

--
Andrew (irc:RhodiumToad)