Re: MVCC catalog access

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MVCC catalog access
Date: 2013-07-02 13:02:01
Message-ID: 20130702130201.GA19837@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-07-01 15:02:41 -0400, Robert Haas wrote:
> >> * These are updated by GetSnapshotData. We initialize them this way
> >> @@ -177,6 +188,9 @@ GetTransactionSnapshot(void)
> >> else
> >> CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
> >>
> >> + /* Don't allow catalog snapshot to be older than xact snapshot. */
> >> + CatalogSnapshotStale = true;
> >> +
> > Do we really need to invalidate snapshots in either situation? Isn't it
> > implied, that if it's still valid, according to a) no invalidation via local
> > invalidation messages b) no invalidations from other backends, there
> > shouldn't be any possible differences when you only look at the catalog?
>
> I had the same thought, removed that code, and then put it back. The
> problem is that if we revive an older snapshot "from the dead", so to
> speak, our backend's advertised xmin might need to go backwards, and
> that seems unsafe - e.g. suppose another backend has updated a tuple
> but not yet committed. We don't see any invalidation messages so
> decide reuse our existing (old) snapshot and begin a scan. After
> we've looked at the page containing the new tuple (and decided not to
> see it), vacuum nukes the old tuple (which we then also don't see).
> Bad things ensue. It might be possible to avoid the majority of
> problems in this area via an appropriate set of grotty hacks, but I
> don't want to go there.

Yes, I thought about that and I think it's a problem that can be solved
without too ugly hacks. But, as you say:

> Yeah, I think there's room for further fine-tuning there. But I think
> it would make sense to push the patch at this point, and then if we
> find cases that can be further improved, or things that it breaks, we
> can fix them. This area is complicated enough that I wouldn't be
> horribly surprised if we end up having to fix a few more problem cases
> or even revert the whole thing, but I think we've probably reached the
> point where further review has less value than getting the code out
> there in front of more people and seeing where (if anywhere) the
> wheels come off out in the wild.

I am pretty sure that we will have to fix more stuff, but luckily we're
in the beginning of the cycle. And while I'd prefer more eyes on the
patch before it gets applied, especially ones knowledgeable about the
implications this has, I don't really see that happening soon. So
applying is more likely to lead to more review than waiting.

So, from me: +1.

Some things that might be worth changing when committing:
* Could you add a Assert(!RelationHasSysCache(relid)) to
RelationInvalidatesSnapshotsOnly? It's not unlikely that it will be
missed by the next person adding a syscache and that seems like it
could have ugly and hard to detect consequences.
* maybe use bsearch(3) instead of open coding the binary search? We
already use it in the backend.
* possibly paranoid, but I'd add a Assert to heap_beginscan_catalog or
GetCatalogSnapshot ensuring we're dealing with a catalog relation. The
consistency mechanisms in GetCatalogSnapshot() only work for those, so
there doesn't seem to be a valid usecase for non-catalog relations.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sawada Masahiko 2013-07-02 13:14:01 Re: Patch for fail-back without fresh backup
Previous Message Peter Eisentraut 2013-07-02 12:46:44 Re: [PATCH] Add an ldapoption to disable chasing LDAP referrals