Re: UPDATE using sub selects

Lists: pgsql-patches
From: NikhilS <nikkhils(at)gmail(dot)com>
To: pgsql-patches(at)postgresql(dot)org
Subject: UPDATE using sub selects
Date: 2007-04-11 08:26:22
Message-ID: d3c4af540704110126o5b6c9c79na40ae4d4c6236d06@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi,

As per discussion on -hackers, a patch which allows updates to use
subselects is attached with this mail.

As per discussion with Tom, I have adopted the following approach:

* Introduce ROWEXPR_SUBLINK type for subqueries that allows multiple column
outputs.
* Populate the targetList with PARAM_SUBLINK entries dependent on the
subselects.
* Modify the targets in-place into PARAM_EXEC entries in the make_subplan
phase.

The above does not require any kluges in the targetList processing code path
at all.

UPDATEs seem to work fine using subselects with this patch. I have modified
the update.sql regression test to include possible variations .

No documentation changes are present in this patch.
Feedback, comments appreciated.

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
updates_using_subselects.patch text/x-patch 37.2 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2007-04-27 01:06:28
Message-ID: 200704270106.l3R16Sa21032@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

NikhilS wrote:
> Hi,
>
> As per discussion on -hackers, a patch which allows updates to use
> subselects is attached with this mail.
>
> As per discussion with Tom, I have adopted the following approach:
>
> * Introduce ROWEXPR_SUBLINK type for subqueries that allows multiple column
> outputs.
> * Populate the targetList with PARAM_SUBLINK entries dependent on the
> subselects.
> * Modify the targets in-place into PARAM_EXEC entries in the make_subplan
> phase.
>
> The above does not require any kluges in the targetList processing code path
> at all.
>
> UPDATEs seem to work fine using subselects with this patch. I have modified
> the update.sql regression test to include possible variations .
>
> No documentation changes are present in this patch.
> Feedback, comments appreciated.
>
> Regards,
> Nikhils
> --
> EnterpriseDB http://www.enterprisedb.com

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2008-03-17 20:48:25
Message-ID: 26399.1205786905@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

NikhilS <nikkhils(at)gmail(dot)com> writes:
> As per discussion on -hackers, a patch which allows updates to use
> subselects is attached with this mail.

I looked over this patch briefly, and I'm afraid it still needs a lot of
work.

The way you altered transformUpdateStmt is a mess; it's duplicating
logic and also using inapplicable tests. (In particular, invoking
checkInsertTargets on a subset of the complete target list is just not
sensible --- it's supposed to be catching conflicts. I think you really
don't need it at all anyway, because the needed tests are done
downstream in the rewriter.) I think the right way is to expand out the
row-assignment portions of the SET target list and then process the
whole SET list the same as before.

The real problem though is that the patch creates a new storage method
and processing path for ROWEXPR_SUBLINK sublinks that's got nothing to
do with the other kinds of sublinks. This is going to be a mess and a
fertile source of bugs. (You already have quite a few stemming from
having overlooked all the places that have to be touched to add a field
to Query, eg backend/nodes/, optimizer/util/clauses.c, ruleutils.c...)

The basic idea is not bad, but I think if we're going to make a list of
subqueries to attach to Query, we should try to do all the forms of
SubLink that way, not just one. From a logical point of view all the
single-result-row forms of SubLink are about the same, and we should
strive to unify them not make them even more different. Also there are
a lot of kluges that could be got rid of if subqueries were pulled out
of expression trees leaving only Param references behind. We'd need
something a bit more general than Param, perhaps, depending on how we
wanted to distinguish output values from different subqueries. Right
now we don't have to worry about that at parse time, but can leave it to
the planner to assign globally unique Param IDs; but that work would
have to be pushed further upstream. (BTW, I'm fairly certain your patch
won't work for multiple subselects in one UPDATE, because it fails to
deal with this point. The test cases in the patch do NOT exercise the
case because you don't have multiple multiple-output subselects in any
of them.)

I thought about trying to fix this stuff but soon concluded that it
was more work than I could justify spending on one patch during
commit fest. It has to be bounced back for now.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2008-03-17 22:40:49
Message-ID: 407.1205793649@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I wrote:
> ... eg backend/nodes/, optimizer/util/clauses.c, ruleutils.c...)

Actually, on further thought ruleutils.c represents one of the biggest
stumbling blocks here. We have to be able to reverse-compile the parse
tree into something that's at least semantically equivalent to the
original query. The existing kluge for UPDATE SET (columns) = ... can
ignore this because it rearranges the query into a sematically
equivalent update with independent SET targets, but the whole problem
with sub-select updates is that there *is* no semantically equivalent
command with a flat targetlist. So one way or another there will have
to be more information in the parse tree than there is now.

If we use Params that can be traced back to specific SubLink list
entries, it might be okay to finesse this by having ruleutils.c check
for successive targetlist entries that reference outputs of the same
SubLink. That's pretty ugly but it might be better than changing the
representation of targetlists, which is something that's understood
by one heck of a lot of code.

regards, tom lane


From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2008-03-18 07:48:19
Message-ID: d3c4af540803180048o291124b1n863e454b188c96a3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi Tom,

> ... eg backend/nodes/, optimizer/util/clauses.c, ruleutils.c...)
>
> Actually, on further thought ruleutils.c represents one of the biggest
> stumbling blocks here. We have to be able to reverse-compile the parse
> tree into something that's at least semantically equivalent to the
> original query. The existing kluge for UPDATE SET (columns) = ... can
> ignore this because it rearranges the query into a sematically
> equivalent update with independent SET targets, but the whole problem
> with sub-select updates is that there *is* no semantically equivalent
> command with a flat targetlist. So one way or another there will have
> to be more information in the parse tree than there is now.
>
> If we use Params that can be traced back to specific SubLink list
> entries, it might be okay to finesse this by having ruleutils.c check
> for successive targetlist entries that reference outputs of the same
> SubLink. That's pretty ugly but it might be better than changing the
> representation of targetlists, which is something that's understood
> by one heck of a lot of code.
>

Thanks a lot for taking a look at this patch. As mentioned and detailed out
by you, this patch *does* need a lot more amount of work and is certainly
not a simpler effort that I had envisioned it to be earlier. Another issue
with the patch as it stands today is that it does not work with correlated
subqueries. This will require some more work too. So for example, something
of this sort does not work yet:
UPDATE update_tbl set (a, b) = (select a, b from other_tbl where c =
update_tbl.c) where a = 10;

I will try to have another shot at it if I can, before the next commit fest.

Thanks and Regards,
Nikhils

--
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: NikhilS <nikkhils(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: UPDATE using sub selects
Date: 2008-03-18 14:58:05
Message-ID: 200803181458.m2IEw5419179@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches


Patch removed from patch queue --- NikhilS, please resubmit when you are
ready. Thanks.

---------------------------------------------------------------------------

NikhilS wrote:
> Hi,
>
> As per discussion on -hackers, a patch which allows updates to use
> subselects is attached with this mail.
>
> As per discussion with Tom, I have adopted the following approach:
>
> * Introduce ROWEXPR_SUBLINK type for subqueries that allows multiple column
> outputs.
> * Populate the targetList with PARAM_SUBLINK entries dependent on the
> subselects.
> * Modify the targets in-place into PARAM_EXEC entries in the make_subplan
> phase.
>
> The above does not require any kluges in the targetList processing code path
> at all.
>
> UPDATEs seem to work fine using subselects with this patch. I have modified
> the update.sql regression test to include possible variations .
>
> No documentation changes are present in this patch.
> Feedback, comments appreciated.
>
> Regards,
> Nikhils
> --
> EnterpriseDB http://www.enterprisedb.com

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +