Re: BUG #8448: looping through query results exits at 10th step under some conditions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: laszlo(dot)rozsahegyi(at)rool(dot)hu
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8448: looping through query results exits at 10th step under some conditions
Date: 2013-11-12 19:40:10
Message-ID: 11851.1384285210@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

laszlo(dot)rozsahegyi(at)rool(dot)hu writes:
> Looping through query results exits at 10th step when
> * query has for update clause, and
> * in loop body between 1 and 10 step update - at least one step - the locked
> record

I looked into this a bit. The key point is that the function has
a FOR loop over a SELECT FOR UPDATE, wherein the SELECT FOR UPDATE
would return the same row of table "test" multiple times (because
it's cross-joined to a generate_series() call). But the body of
the FOR loop updates the just-returned row of "test".

Now what should happen, according to the definition of SELECT FOR
UPDATE, is that once a later command in the same transaction has
updated a given row, SELECT FOR UPDATE shouldn't return that row
anymore. (See the HeapTupleSelfUpdated case in nodeLockRows.c.)
So what ought to happen here is that after the UPDATE, the FOR
loop finds no more records and falls out. (This may not be what
the submitter expected, but it's what should happen.)

However ... plpgsql FOR-over-query loops like to prefetch rows.
In this case, the FOR prefetches ten rows at startup and then
runs the loop body. It will then proceed to iterate over the
next nine rows, even though those would not have been returned
anymore by the underlying SELECT FOR UPDATE.

So basically, the bug here is that the prefetching behavior is
user-visible in a confusing way.

This doesn't seem like it's specific to SELECT FOR UPDATE, either.
In any situation where the FOR query has visible side effects,
the prefetching behavior is going to be user-visible if the loop
body does something that's sensitive to those side effects. This
would at least include INSERT/UPDATE/DELETE RETURNING and queries
containing volatile functions.

There are several things I can think of that we might do about this:

1. Leave the code alone, but document that the prefetching behavior exists.
Users who run into this are on their own to work around it.

2. Try to detect whether the FOR query has any visible side-effects,
and shut off prefetching if so.

3. Document the prefetching behavior and provide a way for users to
shut it off if it's a problem in their specific case.

I'm not particularly attracted by #2; it would be difficult to do with
100% confidence, and it would probably result in a noticeable performance
penalty, with benefits to only a very small percentage of users.
(The prefetch code has been there since 2001, and the number of complaints
about it would certainly not take more than one hand to count.)

#3 is probably the thing to do, but I'm not volunteering to do it
myself. Or maybe we should just do #1 --- again, the number of
people hitting this has been vanishingly small.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Johnston 2013-11-12 21:34:53 Re: BUG #8448: looping through query results exits at 10th step under some conditions
Previous Message Pavel Stehule 2013-11-12 16:00:24 Re: Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist