Re: mvcc catalo gsnapshots and TopTransactionContext

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: mvcc catalo gsnapshots and TopTransactionContext
Date: 2013-10-04 19:15:36
Message-ID: CA+TgmoaevkKbAp=4QgWrX=pLscsROv4b0Gpuh1KNzwq-xX7TAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 9, 2013 at 11:17 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-08-09 14:11:46 -0400, Tom Lane wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>> > On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
>> >> How can it be safe to try to read catalogs if the transaction is aborted?
>>
>> > Well. It isn't. At least not in general. The specific case triggered
>> > here though are cache invalidations being processed which can lead to
>> > the catalog being read (pretty crummy, but not easy to get rid
>> > of). That's actually safe since before we process the invalidations we
>> > have done:
>> > 1) CurrentTransactionState->state = TRANS_ABORT
>> > 2) RecordTransactionAbort(), marking the transaction as aborted in the
>> > clog
>> > 3) marked subxacts as aborted
>> > 3) ProcArrayEndTransaction() (for toplevel ones)
>>
>> > Due to these any tqual stuff will treat the current (sub-)xact and it's
>> > children as aborted. So the catalog lookups will use the catalog in a
>> > sensible state.
>>
>> I don't have any faith in this argument. You might be right that we'll
>> correctly see our own output rows as aborted, but that's barely the tip
>> of the iceberg of risk here. Is it safe to take new locks in an aborted
>> transaction? (What if we're already past the lock-release point in
>> the abort sequence?)
>
> Don't get me wrong. I find the idea of doing catalog lookup during abort
> horrid. But it's been that way for at least 10 years (I checked 7.4), so
> it has at least some resemblance of working.
> Today we do a good bit less than back then, for one we don't do a full
> cache reload during abort anymore, just for the index support
> infrastructure. Also, you've reduced the amount of lookups a bit with the
> relmapper introduction.
>
>> For that matter, given that we don't know what
>> exactly caused the transaction abort, how safe is it to do anything at
>> all --- we might for instance be nearly out of memory. If the catalog
>> reading attempt itself fails, won't we be in an infinite loop of
>> transaction aborts?
>
> Looks like that's possible, yes. There seem to be quite some other
> opportunities for this to happen if you look at the amount of work done
> in AbortSubTransaction(). I guess it rarely happens because we
> previously release some memory...
>
>> I could probably think of ten more risks if I spent a few more minutes
>> at it.
>
> No need to convince me here. I neither could believe we were doing this,
> nor figure out why it even "works" for the first hour of looking at it.
>
>> Cache invalidation during abort should *not* lead to any attempt to
>> immediately revalidate the cache. No amount of excuses will make that
>> okay. I have not looked to see just what the path of control is in this
>> particular case, but we need to fix it, not paper over it.
>
> I agree, although that's easier said than done in the case of
> subtransactions. The problem we have there is that it's perfectly valid
> to still have references to a relation from the outer, not aborted,
> transaction. Those need to be valid for anybody looking at the relcache
> entry after we've processed the ROLLBACK TO/...
>
> I guess the fix is something like we do in the commit case, where we
> transfer invalidations to the parent transaction. If we then process
> local invalidations *after* we've cleaned up the subtransaction
> completely we should be fine. We already do an implicity
> CommandCounterIncrement() in CommitSubTransaction()...

Andres, are you (or is anyone) going to try to fix this assertion failure?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-10-04 19:20:37 Re: mvcc catalo gsnapshots and TopTransactionContext
Previous Message Peter Geoghegan 2013-10-04 18:09:03 Re: Any reasons to not move pgstattuple to core?