SSI-related code drift between index_getnext() and heap_hot_search_buffer()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: SSI-related code drift between index_getnext() and heap_hot_search_buffer()
Date: 2011-05-12 20:01:53
Message-ID: BANLkTikaiW6mkT-Vs0K9wCq4LJ=g1BqEog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attempting to rebase on of Heikki's index-only scan patches has led me
to spend most of the day staring at the two functions named in the
subject line. I assume the code in these two functions has a common
origin, as a large amount of it is nearly identical; and one effect of
Heikki's patch is to get rid of the copy of the logic in
index_getnext() and instead call heap_hot_search_buffer(), with some
API changes to make that work. It didn't take me very long to rebase
the code well enough to make the regression tests pass, but the
isolation tests are failing, so I think I've mucked up the predicate
locking or something somewhere in here. Anyhow, that led me to a
careful comparison of the logic in the two above-named functions,
which as you might expect have drifted apart slightly.

In index_getnext(), we have:

if (valid)
{
/*
* If the snapshot is MVCC, we know that it could accept at
* most one member of the HOT chain, so we can skip examining
* any more members. Otherwise, check for continuation of the
* HOT-chain, and set state for next time.
*/
if (IsMVCCSnapshot(scan->xs_snapshot)
&& !IsolationIsSerializable())
scan->xs_next_hot = InvalidOffsetNumber;

I can't help noticing that the code no longer matches the comment
which precedes it. Apparently we can skip examining any more members
only when SSI is not in use, probably because we need to
predicate-lock the remaining tuples in the HOT chain. Another small
difficulty is that the logic in heap_hot_search_buffer() is slightly
different:

/* If it's visible per the snapshot, we must return it */
valid = HeapTupleSatisfiesVisibility(&heapTuple, snapshot, buffer);
CheckForSerializableConflictOut(valid, relation, &heapTuple, buffer);
if (valid)
{
ItemPointerSetOffsetNumber(tid, offnum);
PredicateLockTuple(relation, &heapTuple);
if (all_dead)
*all_dead = false;
if (IsolationIsSerializable())
match_found = true;
else
return true;
}

Here again we're refusing to bypass the remainder of the HOT chain
when the visible tuple is found, if SSI is in use. But the test is
slightly different. Here it applies whenever
IsolationIsSerializable(), whereas in index_getnext() it applies only
when IsolationIsSerializable() *and* we're using an MVCC snapshot.
Does that make any difference? Should we be consistent? Does SSI
care about non-MVCC snapshots?

Another thing I notice is that in heap_hot_search_buffer(), when this
SSI-related change applies, we go through and visit all the remaining
members of the HOT chain and then return the last (and presumably
only) tuple we encountered. But in index_getnext(), we return the
tuple we found immediately and then continue walking the HOT chain on
the next call. This difference seems possibly significant. If it's
necessary for correctness that we predicate-lock the invisible tuples,
then what happens if the scan stops here due to, say, a LIMIT?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Selena Deckelmann 2011-05-12 20:28:18 PgCon 2011 Lightning Talks - Submit yours today!
Previous Message Robert Haas 2011-05-12 19:29:48 Re: Fw: [BUGS] BUG #6011: Some extra messages are output in the event log at PostgreSQL startup