Re: UPDATE using sub selects

Lists: pgsql-hackers
From: NikhilS <nikkhils(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: UPDATE using sub selects
Date: 2007-03-15 06:06:43
Message-ID: d3c4af540703142306t4d838964ob317fd66df93c803@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I have coded up a patch which solves the following TODO. I will submit a
patch for this soon:

- UPDATE
- Allow UPDATE tab SET ROW (col, ...) = (SELECT...)

http://archives.postgresql.org/pgsql-hackers/2006-07/msg01306.php

The question is that since this enhances the UPDATE syntax, what changes and
where all they need to be made with respect to the documentation?

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2007-03-15 06:32:55
Message-ID: 29551.1173940375@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

NikhilS <nikkhils(at)gmail(dot)com> writes:
> I have coded up a patch which solves the following TODO. I will submit a
> patch for this soon:
> http://archives.postgresql.org/pgsql-hackers/2006-07/msg01306.php

Cool...

> The question is that since this enhances the UPDATE syntax, what changes and
> where all they need to be made with respect to the documentation?

Documentation is the very least of your worries. What exactly is your
implementation plan? If this were a simple or narrow fix it would
have been done already.

regards, tom lane


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2007-03-15 06:52:23
Message-ID: d3c4af540703142352t2476efd8ybdc8c373ef672c78@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

>
> > The question is that since this enhances the UPDATE syntax, what changes
> and
> > where all they need to be made with respect to the documentation?
>
> Documentation is the very least of your worries. What exactly is your
> implementation plan? If this were a simple or narrow fix it would
> have been done already.

The implementation that I have planned is pretty similar to the way "INSERT
INTO ... SELECT" has been implemented.

Along with the grammar changes in gram.y, the changes are localized in the
transformUpdateStmt code path. The SELECT clause ends up becoming a subquery
to the update query with the target column expressions transformed properly
to include the subquery expressions. Does this sound ok?

I have tried some update-subselect variations and they seem to work. For
example the case in the src/test/regress/sql/update.sql, which used to fail
till now, seems to work:

UPDATE update_test SET (a,b) = (select a,b FROM update_test where c = 'foo')
WHERE a = 10;

Will try testing out some other variations too.

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com


From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc: NikhilS <nikkhils(at)gmail(dot)com>
Subject: Re: UPDATE using sub selects
Date: 2007-03-15 14:47:20
Message-ID: 45F95C78.1040307@bigfoot.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

NikhilS wrote:

> I have tried some update-subselect variations and they seem to work. For
> example the case in the src/test/regress/sql/update.sql, which used to
> fail till now, seems to work:
>
> UPDATE update_test SET (a,b) = (select a,b FROM update_test where c =
> 'foo')
> WHERE a = 10;

What's the expected result if the tuple from subselect is more than 1?
I expect no update at all in case of void result set, is this the case ?

Regards
Gaetano Mendola
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF+Vx47UpzwH2SGd4RAvyVAKCGK5pC9B/lmrNjAFPGXhm5ialwSwCglM2n
DxrxWyvJASX5WSF9B8cAMas=
=AoVF
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gaetano Mendola <mendola(at)bigfoot(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, NikhilS <nikkhils(at)gmail(dot)com>
Subject: Re: UPDATE using sub selects
Date: 2007-03-16 02:36:38
Message-ID: 29844.1174012598@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gaetano Mendola <mendola(at)bigfoot(dot)com> writes:
> NikhilS wrote:
>> I have tried some update-subselect variations and they seem to work. For
>> example the case in the src/test/regress/sql/update.sql, which used to
>> fail till now, seems to work:
>>
>> UPDATE update_test SET (a,b) = (select a,b FROM update_test where c =
>> 'foo')
>> WHERE a = 10;

> What's the expected result if the tuple from subselect is more than 1?

Error, per SQL99 section 7.14:

1) If the cardinality of a <row subquery> is greater than 1 (one),
then an exception condition is raised: cardinality violation.

> I expect no update at all in case of void result set, is this the case ?

No, you get nulls; it's a subquery not a join. Per SQL99 7.1:

c) If the <row value constructor> is a <row subquery>, then:

i) Let R be the result of the <row subquery> and let D be the
degree of R.

ii) If the cardinality of R is 0 (zero), then the result of the
<row value constructor> is D null values.

iii) If the cardinality of R is 1 (one), then the result of the
<row value constructor> is R.

regards, tom lane


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-03-16 07:44:25
Message-ID: d3c4af540703160044r576e378ey5bc1eb91924e0de4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

>
> > What's the expected result if the tuple from subselect is more than 1?
>
> Error, per SQL99 section 7.14:
>
> 1) If the cardinality of a <row subquery> is greater than 1
> (one),
> then an exception condition is raised: cardinality violation.

> I expect no update at all in case of void result set, is this the case ?
>
> No, you get nulls; it's a subquery not a join. Per SQL99 7.1:
>
> c) If the <row value constructor> is a <row subquery>, then:
>
> i) Let R be the result of the <row subquery> and let D be
> the
> degree of R.
>
> ii) If the cardinality of R is 0 (zero), then the result of
> the
> <row value constructor> is D null values.
>
> iii) If the cardinality of R is 1 (one), then the result of
> the
> <row value constructor> is R.
>
> regards, tom lane
>

To allow both of the above to hold, I think the subselect will have to be
treated like a EXPR_SUBLINK subquery. I was wondering if we have a similar
mechanism for plain selects/subselects to check and restrict their output to
a single row.

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-03-16 14:37:34
Message-ID: 6371.1174055854@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

NikhilS <nikkhils(at)gmail(dot)com> writes:
> To allow both of the above to hold, I think the subselect will have to be
> treated like a EXPR_SUBLINK subquery. I was wondering if we have a similar
> mechanism for plain selects/subselects to check and restrict their output to
> a single row.

No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow
multiple output columns, or invent a ROW_SUBLINK SubLinkType that is
just like EXPR_SUBLINK except for allowing multiple output columns.
The latter would probably be less likely to break other things...

regards, tom lane


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-03-16 14:55:16
Message-ID: d3c4af540703160755y32f9cbc4we8f0b34aa98a141e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 3/16/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> NikhilS <nikkhils(at)gmail(dot)com> writes:
> > To allow both of the above to hold, I think the subselect will have to
> be
> > treated like a EXPR_SUBLINK subquery. I was wondering if we have a
> similar
> > mechanism for plain selects/subselects to check and restrict their
> output to
> > a single row.
>
> No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow
> multiple output columns, or invent a ROW_SUBLINK SubLinkType that is
> just like EXPR_SUBLINK except for allowing multiple output columns.
> The latter would probably be less likely to break other things...

Yeah, was looking at EXPR_SUBLINK and its single column use case and drove
to the same conclusion that inventing a new sublink type would be better
too. It is indeed becoming a "not so simple and narrow fix" as you had
mentioned earlier in your first response :)

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-03-30 11:26:30
Message-ID: d3c4af540703300426v6e0fa69fl4dcd39d09fd74006@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow
> > multiple output columns, or invent a ROW_SUBLINK SubLinkType that is
> > just like EXPR_SUBLINK except for allowing multiple output columns.
> > The latter would probably be less likely to break other things...
> >
> Yeah, was looking at EXPR_SUBLINK and its single column use case and drove
> to the same conclusion that inventing a new sublink type would be better
> too. It is indeed becoming a "not so simple and narrow fix" as you had
> mentioned earlier in your first response :)

I have invented a ROWEXPR_SUBLINK type that handles multiple output columns.
The trouble is that since eventually columns of the parents have to be part
of the query's targetList, I am sending the entire subquery as one of the
entries in that list and the targetList gets populated with entries
dependent on the subquery much later via make_subplan.

This breaks code in rewriteTargetList (which expects every list entry to be
of type TargetEntry), and expand_targetlist (which expects the targets to be
present in attrno order, the entries added because of the subquery will not
be in order as compared to normal "SET colname = expr" targets).

Is there a simpler way of doing things? Should I try generating a resjunk
TargetEntry in transformUpdateStmt and have its expr point to the subquery
and see if that works?

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-03-30 20:46:12
Message-ID: 23354.1175287572@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

NikhilS <nikkhils(at)gmail(dot)com> writes:
> I have invented a ROWEXPR_SUBLINK type that handles multiple output columns.
> The trouble is that since eventually columns of the parents have to be part
> of the query's targetList, I am sending the entire subquery as one of the
> entries in that list and the targetList gets populated with entries
> dependent on the subquery much later via make_subplan.

> This breaks code in rewriteTargetList (which expects every list entry to be
> of type TargetEntry), and expand_targetlist (which expects the targets to be
> present in attrno order, the entries added because of the subquery will not
> be in order as compared to normal "SET colname = expr" targets).

Hmm. That sounds like it would be a horrid mess. You need to decouple
the execution of the subplan from the use of its outputs, apparently.
There is some precedent for this in the way that InitPlans are handled:
the result of the subplan is stored into a ParamList array entry that's
later referenced by a Param node in the parent's expression tree. That
would generalize easily enough to setting more than one Param, but I'm
not clear on where you'd want to stick the subplan itself in the plan
tree, nor on what controls how often it needs to get evaluated.

An alternative approach is to put the subplan into the rangetable and
use Vars to reference its outputs. Again it's not quite clear what
drives re-execution of the subplan. It strikes me though that an
approach like this might also serve for SQL2003's LATERAL construct,
which'd be a nice thing to support.

regards, tom lane


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-03-31 05:14:53
Message-ID: d3c4af540703302214y49dfbac4k1c5d8eec1e71b1a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 3/31/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> NikhilS <nikkhils(at)gmail(dot)com> writes:
> > I have invented a ROWEXPR_SUBLINK type that handles multiple output
> columns.
> > The trouble is that since eventually columns of the parents have to be
> part
> > of the query's targetList, I am sending the entire subquery as one of
> the
> > entries in that list and the targetList gets populated with entries
> > dependent on the subquery much later via make_subplan.
>
> > This breaks code in rewriteTargetList (which expects every list entry to
> be
> > of type TargetEntry), and expand_targetlist (which expects the targets
> to be
> > present in attrno order, the entries added because of the subquery will
> not
> > be in order as compared to normal "SET colname = expr" targets).
>
> Hmm. That sounds like it would be a horrid mess. You need to decouple
> the execution of the subplan from the use of its outputs, apparently.
> There is some precedent for this in the way that InitPlans are handled:
> the result of the subplan is stored into a ParamList array entry that's
> later referenced by a Param node in the parent's expression tree. That
> would generalize easily enough to setting more than one Param, but I'm
> not clear on where you'd want to stick the subplan itself in the plan
> tree, nor on what controls how often it needs to get evaluated.

Yes, I have tried this already. As you suspect, it seems that the subplan
does not get evaluated if its not part of the targetList at all.

An alternative approach is to put the subplan into the rangetable and
> use Vars to reference its outputs. Again it's not quite clear what
> drives re-execution of the subplan. It strikes me though that an
> approach like this might also serve for SQL2003's LATERAL construct,
> which'd be a nice thing to support.

Ok, I will try this out.

Regards,
Nikhils

--
EnterpriseDB http://www.enterprisedb.com


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gaetano Mendola" <mendola(at)bigfoot(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE using sub selects
Date: 2007-04-11 09:04:46
Message-ID: d3c4af540704110204o3d67bad8l6a837d1dda111f1f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

> Hmm. That sounds like it would be a horrid mess. You need to decouple
> > the execution of the subplan from the use of its outputs, apparently.
> > There is some precedent for this in the way that InitPlans are handled:
> > the result of the subplan is stored into a ParamList array entry that's
> > later referenced by a Param node in the parent's expression tree. That
> > would generalize easily enough to setting more than one Param, but I'm
> > not clear on where you'd want to stick the subplan itself in the plan
> > tree, nor on what controls how often it needs to get evaluated.
>
>
>
>

Ended up using something similar to the above suggestion. I have posted the
patch to -patches based on this.

An important concern was where to stick the evaluation of the subqueries so
that they end up becoming subplans which are used in the execution. For this
I have added a new field in the "Query" structure. This entry gets
preprocessed similar to other fields of the Query from within
subquery_planner.

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com