Quick Links

Re: V3 protocol vs INSERT/UPDATE RETURNING

Lists:	pgsql-hackers

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	V3 protocol vs INSERT/UPDATE RETURNING
Date:	2006-08-11 16:36:26
Message-ID:	16022.1155314186@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I'm looking at modifying the portal execution strategy so that INSERT
RETURNING and friends will work when invoked through "extended query"
protocol. Currently, INSERT/UPDATE/DELETE queries are always executed
under PORTAL_MULTI_QUERY strategy, which runs the portal's queries to
completion and discards any results. Obviously that's gotta change.
I was considering making a PORTAL_ONE_RETURNING strategy that works
just about like PORTAL_ONE_SELECT, but that would have an interesting
side effect. In PORTAL_ONE_SELECT, we can execute the query
incrementally (if the client sends Execute messages with row limit
counts specified), and we don't insist that the client send enough
Execute messages to run the query to completion. If applied to a
RETURNING query this would mean that a multi-row update might not
be executed completely.

I can think of a number of possible ways to handle this:

1. Define it as a feature not a bug. People do occasionally ask for
"UPDATE foo ... LIMIT 1" after all. But this is a pretty klugy way of
getting that, and the arguments that say allowing LIMIT on updating
queries would be a bad idea haven't lost their force.

2. Ignore any requested Execute limit in PORTAL_ONE_RETURNING mode.
Trivial to implement but violates the protocol specification.

3. Throw an error (thereby rolling back the incomplete update)
if client closes the portal without having run it to completion.

4. Treat PORTAL_ONE_RETURNING like PORTAL_UTIL_SELECT rather than
like PORTAL_ONE_SELECT; that is, execute the query to completion
on first call and stash the results in a tuplestore until the
client fetches them.

I don't like #1 much, #2 and #3 seem klugy as well, but #4 is pretty
inefficient. At the moment I'm thinking #3 is the least bad answer,
but does anyone have another idea?

regards, tom lane

From:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: V3 protocol vs INSERT/UPDATE RETURNING
Date:	2006-08-11 16:48:18
Message-ID:	36e682920608110948m676effaeof3c2153017b4dc90@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Sorry, copied to list.

On 8/11/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> 3. Throw an error (thereby rolling back the incomplete update)
> if client closes the portal without having run it to completion.

Sounds like the most reasonable considering. I'm not averse to it.

> 4. Treat PORTAL_ONE_RETURNING like PORTAL_UTIL_SELECT rather than
> like PORTAL_ONE_SELECT; that is, execute the query to completion
> on first call and stash the results in a tuplestore until the
> client fetches them.

I agree that it's inefficient, but am trying to think of any other
positive reasons for doing #4 instead. Can you think of any other
advantages system-wide to using #4 instead of #3?

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 2nd Floor | jharris(at)enterprisedb(dot)com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/

From:	Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To:	postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: V3 protocol vs INSERT/UPDATE RETURNING
Date:	2006-08-11 16:50:26
Message-ID:	1155315025.21451.306.camel@coppola.muc.ecircle.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> 1. Define it as a feature not a bug. People do occasionally ask for
> "UPDATE foo ... LIMIT 1" after all. But this is a pretty klugy way of
> getting that, and the arguments that say allowing LIMIT on updating
> queries would be a bad idea haven't lost their force.

Being one of those who was asking for an UPDATE/DELETE with limit, I
would be very glad if this would be implemented... it would be a big
help for batch-processing data in OLTP environment (no long running
queries allowed). I still don't see why would nondeterminism be
generally a bad thing when there are applications which don't care about
that...

Cheers,
Csaba.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: V3 protocol vs INSERT/UPDATE RETURNING
Date:	2006-08-11 17:09:21
Message-ID:	16229.1155316161@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

"Jonah H. Harris" <jonah(dot)harris(at)gmail(dot)com> writes:
> On 8/11/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> 4. Treat PORTAL_ONE_RETURNING like PORTAL_UTIL_SELECT rather than
>> like PORTAL_ONE_SELECT; that is, execute the query to completion
>> on first call and stash the results in a tuplestore until the
>> client fetches them.

> I agree that it's inefficient, but am trying to think of any other
> positive reasons for doing #4 instead. Can you think of any other
> advantages system-wide to using #4 instead of #3?

Not really. Locks and so forth held by the query would be held till
commit in any case, so I don't see much advantage in finishing the
query immediately.

regards, tom lane

From:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: V3 protocol vs INSERT/UPDATE RETURNING
Date:	2006-08-11 18:09:51
Message-ID:	36e682920608111109y6e4117a8l2d00bfc14b418fbd@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 8/11/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Not really. Locks and so forth held by the query would be held till
> commit in any case, so I don't see much advantage in finishing the
> query immediately.

Very true.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: V3 protocol vs INSERT/UPDATE RETURNING
Date:	2006-08-11 20:14:06
Message-ID:	23253.1155327246@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> I agree that it's inefficient, but am trying to think of any other
> positive reasons for doing #4 instead.

I found a showstopper reason why it has to be done this way: the AFTER
TRIGGER code isn't capable of dealing with interleaved execution of
different queries (it can basically only track nested queries).
Possibly that could be improved in the future, but for 8.2 I think
we're stuck with using a tuplestore.

One optimization I think might not be too hard is to bypass the
tuplestore and stream RETURNING tuples directly to the client if the
first Execute for the portal doesn't give a row limit (which is surely
the typical case). I don't plan to do that in the first cut though.

regards, tom lane