Re: mvcc catalo gsnapshots and TopTransactionContext

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: mvcc catalo gsnapshots and TopTransactionContext
Date: 2014-02-02 22:50:39
Message-ID: 20140202225039.GR5930@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-02-02 15:16:45 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On February 2, 2014 5:52:22 PM CET, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> More to the point, changing the Assert so it doesn't fire
> >> doesn't do one damn thing to ameliorate the fact that cache reload
> >> during transaction abort is wrong and unsafe.
>
> > And, as upthread, I still don't think that's correct. I don't have
> > sources available right now, but IIRC we already have aborted out of the
> > transaction. Released locks, the xid and everything.
>
> Nope ... the case exhibited in the example is dying in AtEOSubXact_Inval,
> which is in the very midst of subxact abort.

True. But we've done LWLockReleaseAll(), TransactionIdAbortTree(),
XidCacheRemoveRunningXids() and
ResourceOwnerRelease(RESOURCE_RELEASE_BEFORE_LOCKS), which is why we are
currently able to build correct entries, even though we are in an
aborted transaction.

> I've been thinking about this for the past little while, and I believe
> that it's probably okay to have RelationClearRelation leave the relcache
> entry un-rebuilt, but with rd_isvalid = false so it will be rebuilt when
> next opened. The rationale is explained in the comments in the attached
> patch. I've checked that this fixes Noah's test case and still passes
> the existing regression tests.

Hm, a bit scary, but I don't see an immediate problem.

The following comment now is violated for nailed relations
* We assume that at the time we are called, we have at least AccessShareLock
* on the target index. (Note: in the calls from RelationClearRelation,
* this is legitimate because we know the rel has positive refcount.)
but that should be easy to fix.

I wonder though, if we couldn't just stop doing the
RelationReloadIndexInfo() for nailed indexes. The corresponding comment
says:
* If it's a nailed index, then we need to re-read the pg_class row to see
* if its relfilenode changed. We do that immediately if we're inside a
* valid transaction. Otherwise just mark the entry as possibly invalid,
* and it'll be fixed when next opened.
*/

but any relfilenode change should have already been handled by
RelationInitPhysicalAddr()?

Do you plan to backpatch this? If so, even to 8.4?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2014-02-02 22:52:42 Re: GSOC13 proposal - extend RETURNING syntax
Previous Message Peter Eisentraut 2014-02-02 21:57:28 Re: pg_basebackup and pg_stat_tmp directory