Index-only scans and non-MVCC snapshots

From: Ryan Johnson <ryan(dot)johnson(at)cs(dot)utoronto(dot)ca>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Index-only scans and non-MVCC snapshots
Date: 2014-06-27 04:47:47
Message-ID: 53ACF773.50200@cs.utoronto.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

As part of a research project, I'm trying to change Read Committed
isolation to use HeapTupleSatisfiesNow rather than acquiring a new
snapshot at every command [1]. Things appear to have gone reasonably
well so far, except certain queries fail with "ERROR: non-MVCC
snapshots are not supported in index-only scans."

I'm using v9.3.2, and the docs claim that index-only scans work without
MVCC, but require some extra locking to avoid races [2]. Is this not
actually implemented? If that is the case, shouldn't the query optimizer
avoid selecting index-only scans for non-MVCC snapshots?

I realize I'm playing with fire here, but any pointers to sections of
code I might look at to either work around or fix this issue would be
greatly appreciated. I've been looking around in index_fetch_heap
(indexam.c) as well as other locations that use scan->xs_continue_hot;
there seems to be code in place to detect when a non-MVCC snapshot is in
use, as if that were nothing out of the ordinary, but nothing prevents
the error from arising if a hot chain is actually encountered.

Thanks,
Ryan

[1] Right now, Read Committed is significantly *slower* than Repeatable
Read---for transactions involving multiple short commands---because the
former acquires multiple snapshots per transaction and causes a lwlock
bottleneck on my 12-core machine.

[2] http://www.postgresql.org/docs/9.3/static/index-locking.html:
> with a non-MVCC-compliant snapshot (such as SnapshotNow), it would be
> possible to accept and return a row that does not in fact match the
> scan keys ... [so] we use a pin on an index page as a proxy to
> indicate that the reader might still be "in flight" from the index
> entry to the matching heap entry. Making ambulkdelete block on such a
> pin ensures that VACUUM cannot delete the heap entry before the reader
> is done with it. ... This solution requires that index scans be
> "synchronous": we have to fetch each heap tuple immediately after
> scanning the corresponding index entry. This is expensive for a number
> of reasons. An "asynchronous" scan in which we collect many TIDs from
> the index, and only visit the heap tuples sometime later, requires
> much less index locking overhead and can allow a more efficient heap
> access pattern. Per the above analysis, we must use the synchronous
> approach for non-MVCC-compliant snapshots.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-06-27 05:04:46 Re: Index-only scans and non-MVCC snapshots
Previous Message Amit Kapila 2014-06-27 04:22:57 Re: ALTER SYSTEM RESET?