Re: Update with subselect sometimes returns wrong result

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Oliver Seemann <oseemann(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Update with subselect sometimes returns wrong result
Date: 2013-12-01 06:41:02
Message-ID: 22220.1385880062@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Slightly hacked up (probably python 2 only) version of the test script
> attached.

Ah, deleting and reinserting the rows each time like that makes it a
lot more reproducible. I don't have the full story but it's got something
to do with screwed-up tuple info flags. The plan that's generated looks
like

Update on t1
-> Nested Loop
Join Filter: (t1.id = subset.id)
-> Seq Scan on t1
-> Subquery Scan on subset
-> Limit
-> LockRows
-> Seq Scan on t1 t1_1

so the subquery scan will be executed twice, once for each row in t1.
The first time through, heap_lock_tuple looks at the first tuple and
locks it. It doesn't get called on the second tuple because the LIMIT
node stops evaluation. But in the second scan, heap_lock_tuple returns
HeapTupleSelfUpdated for the first tuple, so ExecLockRows ignores it,
moves on to the second tuple, and returns that. The LIMIT is happy cause
it just got one row back; it doesn't know it's not the same row as
before. And this row of course passes the join qual with the second
row from the outer scan of t1, so we perform a second update.

Now, the thing about this is that the tuple heap_lock_tuple is rejecting
in the second pass is the one that we just updated, up at the ModifyTable
plan node. So I can't find it exactly surprising that it says
HeapTupleSelfUpdated. But tracing through tqual.c shows that the tuple
has got the HEAP_XMAX_IS_MULTI bit set, which might be thought a bit
peculiar. There's not multiple transactions involved.

Anyway, at this point I'm not so much wondering why it fails as why it
(seems to) work *any* of the time. And how is it that VACUUM sometimes
manages to flip it from working state to not-working state? (Once you're
in the state where the UPDATE will say it updated two rows, it's 100%
reproducible.)

Anyway, it seems pretty clear that the explanation is down somewhere in
the tuple visibility and multixact logic that you and Alvaro have been
hacking on with such vim lately. I'm out of steam for tonight, over
to you ...

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2013-12-01 07:03:55 Re: Update with subselect sometimes returns wrong result
Previous Message David Johnston 2013-12-01 03:00:01 Re: Update with subselect sometimes returns wrong result