Re: Writeable CTE patch

Lists: pgsql-hackers
From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Writeable CTE patch
Date: 2009-11-14 17:26:13
Message-ID: 4AFEE835.5070606@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Attached is the latest version of this patch.

I altered rewriting a bit (I've brought the problems with the previous
approach up a couple of times before) and this version should have the
expected output in all situations. This patch doesn't allow you to use
INSERT/UPDATE/DELETE as the top level statement, but you can get around
that by putting the desired top-level statement in a new CTE.

Since the last patch I also moved ExecOpenIndices to nodeModifyTable.c
because the top-level executor doesn't know which result relations are
opened for which operations.

One thing which has bothered me a while is that there is no clear option
for commandType when you have a multiple types of statements in a single
Query. In some places it'd help to know that there are multiple
different statements. This is now achieved by having hasWritableCtes
variable in PlannedStmt, but that doesn't help in places where you don't
have access to (or there isn't yet one) PlannedStmt, which has lead me
to think that we could have a CMD_MULTI or a similar value to mark these
Queries. I haven't taken the time to look at this in detail, but it's
something to think about.

Regards,
Marko Tiikkaja

Attachment Content-Type Size
writeablecte.patch text/plain 43.2 KB

From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-15 21:27:15
Message-ID: 4B007233.1030602@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Attached is the latest version of this patch.

Here's that same patch in context diff format. Sorry for the noise.

Regards,
Marko Tiikkaja

Attachment Content-Type Size
writeablectev1.patch text/plain 52.0 KB

From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-17 06:15:05
Message-ID: 34d269d40911162215p53251a9fi8b691aa8c671b5e2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 15, 2009 at 14:27, Marko Tiikkaja
<marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> wrote:
> I wrote:
>>
>> Attached is the latest version of this patch.

Find attached a incremental diff with the following changes:
-get rid of operation argument to InitResultRelInfo, its unused now
-add some asserts to make sure places we use subplanstate now that it
can be null
(*note* AFAICT its a cant happen, but it made me nervous hence the Asserts)
-remove unneeded plannodes.h includes
-minor whitespace fix

Other comments:
You have an "XXX we should probably update the snapshot a bit
differently". Any plans on that?
Thats quite a bit of new code in ExecutePlan, worth splitting into its
own function?

Also, after reading through the previous threads; it was not
immediately obvious that you dealt with
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php by
only allowing selects or values at the top level of with.

Find below the standard review boilerplate from
http://wiki.postgresql.org/wiki/Reviewing_a_Patch

Summary: looks ready for a commiter to me after above comments are addressed.

Submission review:
*Is the patch in context diff format?
Yes
* Does it apply cleanly to the current CVS HEAD?
Yes, with fuzz
* Does it include reasonable tests, necessary doc patches, etc?
Yes

Usability review:
Read what the patch is supposed to do, and consider:
* Does the patch actually implement that?
Yes
* Do we want that?
Yes
* Do we already have it?
No
* Does it follow SQL spec, or the community-agreed behavior?
Yes
* Does it include pg_dump support (if applicable)?
N/A
* Are there dangers?
No
* Have all the bases been covered?
All the ones I can see

Feature test:
Apply the patch, compile it and test:
* Does the feature work as advertised?
Yes
* Are there corner cases the author has failed to consider?
Not that I could trigger
* Are there any assertion failures or crashes?
No
o Review should be done with the configure options --enable-cassert
and --enable-debug turned on;
Yes

Performance review:
*Does the patch slow down simple tests:
No
*If it claims to improve performance, does it?
N/A
*Does it slow down other things
No

Coding review:
Read the changes to the code in detail and consider:
* Does it follow the project coding guidelines?
Yes
* Are there portability issues?
No
* Will it work on Windows/BSD etc?
Yes
* Are the comments sufficient and accurate?
Yes
* Does it do what it says, correctly?
Yes
* Does it produce compiler warnings?
No
* Can you make it crash?
No

Architecture review:
Consider the changes to the code in the context of the project as a whole:
* Is everything done in a way that fits together coherently with
other features/modules?
I think so.
* Are there interdependencies than can cause problems?
No

Attachment Content-Type Size
wcte_inc.patch text/x-patch 3.7 KB

From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Alex Hunsaker <badalex(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-17 10:54:46
Message-ID: 4B0280F6.7050706@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alex Hunsaker wrote:
> Find attached a incremental diff with the following changes:
> -get rid of operation argument to InitResultRelInfo, its unused now

Missed that one. Thanks.

> -add some asserts to make sure places we use subplanstate now that it
> can be null
> (*note* AFAICT its a cant happen, but it made me nervous hence the Asserts)

Indeed, it shouldn't happen, but this seems like a decent precaution.

> Other comments:
> You have an "XXX we should probably update the snapshot a bit
> differently". Any plans on that?

I've looked into that, but couldn't find a better way. Maybe I should
take out my scuba gear for a new dive into the snapshot code..

> Thats quite a bit of new code in ExecutePlan, worth splitting into its
> own function?

Could probably be.

> Also, after reading through the previous threads; it was not
> immediately obvious that you dealt with
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php by
> only allowing selects or values at the top level of with.

This is actually just something missing from the current implementation.
The relevant posts are in the same thread:
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00558.php and
the two follow-ups. The comment in ExecutePlan() is a bit misleading.
What I meant is that we don't call GetCurrentCommandId() in
standard_ExecutorStart(). Instead we get a new CID for every CTE with
INSERT/UPDATE/DELETE. That comment tried to point out the fact that
this strategy could fail if there was a non-SELECT query as the
top-level statement because we wouldn't increment the CID after the last
CTE. I did it this way because it works well for the purposes of this
patch and I didn't see an obvious way to determine whether we need a new
CID for the top-level statement or not.

I'll send an updated patch in a couple of days.

Regards,
Marko Tiikkaja


From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-17 15:52:45
Message-ID: 34d269d40911170752yaa09dear37ed99dd3f15ff7c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 17, 2009 at 03:54, Marko Tiikkaja
<marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> wrote:
>> Also, after reading through the previous threads; it was not
>> immediately obvious that you dealt with
>> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php by
>> only allowing selects or values at the top level of with.
>
> This is actually just something missing from the current implementation.
>  The relevant posts are in the same thread:

> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00558.php and
> the two follow-ups.  The comment in ExecutePlan() is a bit misleading.

Hrm I tried the various forms of:
with x as ( ... ) insert/update/delete

and could not get any of them to work. So I assumed the comment
about only SELECT and values were allowed was correct. Maybe a
function that does an insert or update at the top level could get it
to break?

> What I meant is that we don't call GetCurrentCommandId() in
> standard_ExecutorStart().  Instead we get a new CID for every CTE with
> INSERT/UPDATE/DELETE.  That comment tried to point out the fact that
> this strategy could fail if there was a non-SELECT query as the
> top-level statement because we wouldn't increment the CID after the last
> CTE.

Right... Which I thought was more or less the recommendation? Guess
Ill have to go re-read that discussion.

> I did it this way because it works well for the purposes of this
> patch and I didn't see an obvious way to determine whether we need a new
> CID for the top-level statement or not.
>
> I'll send an updated patch in a couple of days.

Peachy.


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Alex Hunsaker <badalex(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-23 21:33:29
Message-ID: 4B0AFFA9.30108@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Sorry for the delay, I've been very busy for the last two weeks.
Attached is the latest version of the patch.

The snapshot update code is still the same, I have no good idea what, if
anything, should be done to it. In addition to that, I decided to keep
the code in ExecutePlan() as it was in the last patch. I tried to make
the comments a bit more clear to avoid confusion.

Regards,
Marko Tiikkaja

Attachment Content-Type Size
writeable2.patch text/plain 54.9 KB

From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Fwd: Writeable CTE patch
Date: 2009-11-25 20:30:30
Message-ID: 34d269d40911251230q7b15746aoc9cc9ece7533eea@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Argh hit the wrong reply button...

---------- Forwarded message ----------
From: Alex Hunsaker <badalex(at)gmail(dot)com>
Date: Wed, Nov 25, 2009 at 10:20
Subject: Re: [HACKERS] Writeable CTE patch
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>

On Mon, Nov 23, 2009 at 14:33, Marko Tiikkaja
<marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> wrote:
> Hi,
>
> Sorry for the delay, I've been very busy for the last two weeks.
> Attached is the latest version of the patch.

Heh, sorry about my delay.

> The snapshot update code is still the same, I have no good idea what, if
> anything, should be done to it.

Me neither.

>  In addition to that, I decided to keep
> the code in ExecutePlan() as it was in the last patch.

Fine with me.

I think I've taken this patch about as far as I can take it.  So I'm
going to mark it as ready for commiter.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-26 18:24:32
Message-ID: 4932.1259259872@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> writes:
> Attached is the latest version of the patch.

I started to look at this patch and soon noted that a very substantial
fraction of the delta had to do with getting rid of dependencies on
estate->es_result_relation_info. It seemed to me that if we were going
to do that, we should try to get rid of the field altogether, so I
looked at what that would take. Unfortunately, there's a problem with
that, which I think proves the patch's approach invalid as well. With
the patch's changes, the only remaining user of es_result_relation_info
is ExecContextForcesOids(), which is only called during plan node
initialization. The patch supposes that it's sufficient to set up
es_result_relation_info while a ModifyTable node is initializing its
child nodes. What this misses is EvalPlanQual, which can require
initialization of a new plan tree during execution. To make it work
we'd need to be sure to set up es_result_relation_info during execution
as well, which pretty much destroys any gain from the proposed
refactoring.

When I realized this, my first thought was that we might as well drop
all the proposed changes that involve avoiding use of
es_result_relation_info. I was wondering though whether you had a
functional reason for getting rid of them, or if it was just trying to
tidy the code a bit?

I did think of a plan B: we could get rid of ExecContextForcesOids()
altogether and let plan nodes always do whatever seems locally most
efficient about OIDs. The consequence of this would be that if we
are doing INSERT or SELECT INTO and the plan tree produces the wrong
has-OIDs state, we would need to insert a junkfilter to fix it, which
would represent processing that would be unnecessary if there was no
other reason to have a junkfilter (ie, no junk columns in the result).
This actually does not affect UPDATE, which always has a junk TID
column so always needs a junkfilter anyway; nor DELETE, which doesn't
need to produce tuples for insertion. At the time we put in
ExecContextForcesOids() it seemed that there was enough of a use-case
to justify klugery to avoid the extra filtering overhead. However,
since OIDs in user tables have been deprecated for several versions
now, I'm thinking that maybe the case doesn't arise often enough to
justify keeping such a wart in the executor.

Comments?

regards, tom lane


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-26 18:39:28
Message-ID: 4B0ECB60.5010609@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> What this misses is EvalPlanQual, which can require
> initialization of a new plan tree during execution.

Agh. You're right, I missed that.

> When I realized this, my first thought was that we might as well drop
> all the proposed changes that involve avoiding use of
> es_result_relation_info. I was wondering though whether you had a
> functional reason for getting rid of them, or if it was just trying to
> tidy the code a bit?

The latter.

> However,
> since OIDs in user tables have been deprecated for several versions
> now, I'm thinking that maybe the case doesn't arise often enough to
> justify keeping such a wart in the executor.

Under the circumstances I'd lean towards this option.

Regards,
Marko Tiikkaja


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-27 17:42:02
Message-ID: 2888.1259343722@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> writes:
> Tom Lane wrote:
>> since OIDs in user tables have been deprecated for several versions
>> now, I'm thinking that maybe the case doesn't arise often enough to
>> justify keeping such a wart in the executor.

> Under the circumstances I'd lean towards this option.

I've been fooling around with this further and have gotten as far as
the attached patch. It passes regression tests but suffers from an
additional performance loss: the physical-tlist optimization is disabled
when scanning a relation having OIDs. (That is, we'll always use
ExecProject even if the scan is "SELECT * FROM ...".) I think this loss
is worth worrying about since it would apply to queries on system
catalogs, even if the database has no OIDs in user tables. The trick
is to make the knowledge of the required hasoid state available at
ExecAssignResultType time, so that the plan node's result tupdesc is
constructed correctly.

What seems like the best bet is to merge ExecAssignResultTypeFromTL
and ExecAssignScanProjectionInfo into a single function that should
be used by scan node types. It'll do the determination of whether
a physical-tlist optimization is possible, and then set up both the
output tupdesc and the projection info accordingly. This will make the
patch diff a good bit longer but not much more interesting, so I'm
sending it along at this stage.

I think this is worth doing since it cleans up one of the grottier
parts of executor initialization. The whole thing around
ExecContextForcesOids was never pretty, and it's been the source of
more than one bug if memory serves.

Comments?

regards, tom lane

Attachment Content-Type Size
no_es_result_relation_info.patch text/x-patch 34.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-27 19:44:50
Message-ID: 5503.1259351090@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> I think this is worth doing since it cleans up one of the grottier
> parts of executor initialization. The whole thing around
> ExecContextForcesOids was never pretty, and it's been the source of
> more than one bug if memory serves.

On further review there's a really serious stumbling block here.
Consider
INSERT INTO t1 SELECT * FROM t2 UNION ALL SELECT * FROM t3
where the three tables all have the same user columns but t2 has
OIDs and t3 not (or vice versa). Without ExecContextForcesOids
or something very much like it, both scan nodes will think they
can return physical tuples. The output of the Append node will
therefore contain some tuples with OIDs and some without. Append
itself can't fix that since it doesn't project. In many queries
this would not matter --- but if we are inserting them directly
into t1 without any further filtering, it does matter.

I can imagine various ways around this, but it's not clear that
any of them are much less grotty than the code is now. In any
case this was just a marginal code cleanup idea and it doesn't
seem worth spending so much time on right now.

I'm going to go back to plan A: drop the es_result_relation_info
changes from the patch.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-28 18:59:14
Message-ID: 29292.1259434754@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> writes:
> Attached is the latest version of the patch.

I looked through this patch and concluded that it still needs a fair
amount of work, so I'm bouncing it back for further work.

1. I thought we'd agreed at
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00558.php
that the patch should support WITH on DML statements, eg
with (some-query) insert into foo ...
This might not take much more than grammar additions, but it's
definitely lacking at the moment.

2. The handling of rules on DML WITH queries is far short of sufficient.
AFAICT, what it's doing is rewriting the query, then taking the first
or last element of the resulting query list as replacing the WITH
query, and adding the rest of the list after or before the main query.
This does not work at all for cases involving conditional DO INSTEAD
rules, since there could be more than one element of the resulting
query list that's responsible for delivering results depending on the
runtime outcome of the condition. I don't think it works for
unconditional DO INSTEAD either, since the rule producing output might
not be the first or last one. And in any case it fails to satisfy the
POLA in regards to the order of execution of DO ALSO queries relative
to other WITH queries or the main query.

I am not sure that it is possible to fix this without really drastic
surgery on the rule mechanisms. Or maybe we ought to rethink what
the representation of DML WITH queries is.

Perhaps it would be acceptable to just throw ERROR_FEATURE_NOT_SUPPORTED
when there are DO ALSO or conditional DO INSTEAD rules applying to the
target of a DML WITH query. I wouldn't normally think that just blowing
off such a thing meets the project's quality standards, but we all know
that the current rule mechanism is in need of a ground-up redesign anyway.
It's hard to justify putting a lot of work into making it work with DML
WITH queries when we might be throwing it all out in the future.

One thing that really does have to draw an error is that AFAIR the current
rule feature doesn't enforce that a rewritten query produce the same type
of output that the original would have. We just ship off whatever the
results are to the client, and let it sort everything out. In a DML WITH
query, though, I think we do have to insist that the rewritten query(s)
still produce the same RETURNING rowtype as before.

3. I'm pretty unimpressed with the code added to ExecutePlan. It knows
way more than it ought to about CTEs, and yet I don't think it's doing the
right things anyway --- in particular, won't it run the "leader" CTE more
than once if one CTE references another? I think it would be better if
the PlannedStmt representation just told ExecutePlan what to do, rather
than having all these heuristics inside ExecutePlan. (BTW, I also think
it would work better if you had the CommandCounterIncrement at the bottom
of the loop, after the subquery execution not before it. But I'm not sure
it's safe for ExecutePlan to be modifying the snapshot it's handed anyway.)

I wonder whether it would be practical to fix both #2 and #3 by having the
representation of DML WITH queries look more like the representation of
rule rewrite output --- that is, generate a list of top-level Queries
not one Query with DML subqueries in its CTE list. The main thing that
seems to be missing in order to allow that is for a Query to refer back to
the output of a previous Query in the list. This doesn't seem
tremendously hard at runtime --- it's just a tuplestore to keep around
--- but I'm not clear what it ought to look like in terms of the parsetree
representation.

4. As previously noted, the changes to avoid using es_result_relation_info
are broken and need to be dropped from the patch.

regards, tom lane


From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-29 22:38:50
Message-ID: 34d269d40911291438p65bc85deg280a6a508b3e65bc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Nov 28, 2009 at 11:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 1. I thought we'd agreed at
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00558.php
> that the patch should support WITH on DML statements, eg
>        with (some-query) insert into foo ...
> This might not take much more than grammar additions, but it's
> definitely lacking at the moment.

Hrm ? A few messages down you say SELECT should be a good start

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01081.php

> 2. The handling of rules on DML WITH queries is far short of sufficient.

Ick.

> Perhaps it would be acceptable to just throw ERROR_FEATURE_NOT_SUPPORTED
> when there are DO ALSO or conditional DO INSTEAD rules applying to the
> target of a DML WITH query.

+1

> 3. I'm pretty unimpressed with the code added to ExecutePlan.
> I wonder whether it would be practical to fix both #2 and #3 by having the
> representation of DML WITH queries look more like the representation of
> rule rewrite output

Interesting... This seems like the best solution ( assuming its
workable ). It also looks like it might make #1 easier as well.

However, I think the current approach does have some virtue in that I
was surprised how little the patch was. Granted that is partly due to
ExecutePlan knowing to much.


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-30 18:43:51
Message-ID: 4B141267.2030107@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> 1. I thought we'd agreed at
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00558.php
> that the patch should support WITH on DML statements, eg
> with (some-query) insert into foo ...
> This might not take much more than grammar additions, but it's
> definitely lacking at the moment.

Ok, I added these.

> One thing that really does have to draw an error is that AFAIR the current
> rule feature doesn't enforce that a rewritten query produce the same type
> of output that the original would have. We just ship off whatever the
> results are to the client, and let it sort everything out. In a DML WITH
> query, though, I think we do have to insist that the rewritten query(s)
> still produce the same RETURNING rowtype as before.

Agreed.

> 3. I'm pretty unimpressed with the code added to ExecutePlan. It knows
> way more than it ought to about CTEs, and yet I don't think it's doing the
> right things anyway --- in particular, won't it run the "leader" CTE more
> than once if one CTE references another?

Yes. Are you suggesting something more intelligent to avoid scanning
the CTE more than once or..?

> I think it would be better if
> the PlannedStmt representation just told ExecutePlan what to do, rather
> than having all these heuristics inside ExecutePlan.

Yup, seems like a better choice.

> (BTW, I also think
> it would work better if you had the CommandCounterIncrement at the bottom
> of the loop, after the subquery execution not before it. But I'm not sure
> it's safe for ExecutePlan to be modifying the snapshot it's handed anyway.)

Agreed. I'm a bit lost here with the snapshot business; is doing this
work in ExecutePlan() out of the question or is it just that what I'm
doing is wrong?

> 4. As previously noted, the changes to avoid using es_result_relation_info
> are broken and need to be dropped from the patch.

Done. I kept the logic for result relations to allow nested ModifyTable
nodes, but I don't think it ever did the right thing with EvalPlanQual()
and nested nodes. I'll have think about that.

Regards,
Marko Tiikkaja


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-30 18:56:40
Message-ID: 8067.1259607400@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> writes:
> Tom Lane wrote:
>> (BTW, I also think
>> it would work better if you had the CommandCounterIncrement at the bottom
>> of the loop, after the subquery execution not before it. But I'm not sure
>> it's safe for ExecutePlan to be modifying the snapshot it's handed anyway.)

> Agreed. I'm a bit lost here with the snapshot business; is doing this
> work in ExecutePlan() out of the question or is it just that what I'm
> doing is wrong?

I think it's not a good idea for ExecutePlan to be scribbling on the
executor's input, and the provided snapshot is definitely an input.
It might accidentally fail to fail in the present system, but it would
always be a hazard. The only thing that I'd be comfortable with is
copying the snap and modifying the copy. This might be okay from a
performance standpoint if it's done at the bottom of the loop (ie,
only when you actually have at least one writable CTE). It would be
altogether cleaner though if the CommandCounterIncrement responsibility
were in the same place it is now, ie the caller of the executor. Which
could be possible if we restructure the rewriter/planner output as a
list of Queries instead of just one. I'm not currently sure how hard
that would be, though; it might not be a practical answer.

regards, tom lane


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-30 19:00:13
Message-ID: 4B14163D.7090104@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> It would be
> altogether cleaner though if the CommandCounterIncrement responsibility
> were in the same place it is now, ie the caller of the executor. Which
> could be possible if we restructure the rewriter/planner output as a
> list of Queries instead of just one. I'm not currently sure how hard
> that would be, though; it might not be a practical answer.

I'm trying to avoid doing this, at least for now.

Regards,
Marko Tiikkaja


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-12-07 14:46:11
Message-ID: 4B1D1533.7080804@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> The only thing that I'd be comfortable with is
> copying the snap and modifying the copy.

I don't see an easy way to do that with the current code; CopySnapshot()
is static and PushUpdatedSnapshot() seems to be a bit of a pain since it
messes up some of the existing code which uses the active snapshot
stack. Any ideas?

Regards,
Marko Tiikkaja


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-12-07 22:06:35
Message-ID: 20091207220635.GP3552@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja escribió:
> Tom Lane wrote:
> >The only thing that I'd be comfortable with is
> >copying the snap and modifying the copy.
>
> I don't see an easy way to do that with the current code;
> CopySnapshot() is static and PushUpdatedSnapshot() seems to be a bit
> of a pain since it messes up some of the existing code which uses
> the active snapshot stack. Any ideas?

That API is rather new. Maybe we need a new entry point, say
GetActiveSnapshotCopy or some such.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support