Re: Problem with txid_snapshot_in/out() functionality

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <jan(at)wi3ck(dot)info>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Problem with txid_snapshot_in/out() functionality
Date: 2014-04-14 09:15:30
Message-ID: 534BA732.20203@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 04/12/2014 05:03 PM, Andres Freund wrote:
> On 2014-04-12 09:47:24 -0400, Tom Lane wrote:
>> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
>>> On 04/12/2014 12:07 AM, Jan Wieck wrote:
>>>> the Slony team has been getting seldom reports of a problem with the
>>>> txid_snapshot data type.
>>>> The symptom is that a txid_snapshot on output lists the same txid
>>>> multiple times in the xip list part of the external representation.
>>
>>> It's two-phase commit. When preparing a transaction, the state of the
>>> transaction is first transfered to a dummy PGXACT entry, and then the
>>> PGXACT entry of the backend is cleared. There is a transient state when
>>> both PGXACT entries have the same xid.
>>
>> Hm, yeah, but why is that intermediate state visible to anyone else?
>> Don't we have exclusive lock on the PGPROC array while we're doing this?
>
> It's done outside the remit of ProcArray lock :(. And documented to lead
> to duplicate xids in PGXACT.
> EndPrepare():
> /*
> * Mark the prepared transaction as valid. As soon as xact.c marks
> * MyPgXact as not running our XID (which it will do immediately after
> * this function returns), others can commit/rollback the xact.
> *
> * NB: a side effect of this is to make a dummy ProcArray entry for the
> * prepared XID. This must happen before we clear the XID from MyPgXact,
> * else there is a window where the XID is not running according to
> * TransactionIdIsInProgress, and onlookers would be entitled to assume
> * the xact crashed. Instead we have a window where the same XID appears
> * twice in ProcArray, which is OK.
> */
> MarkAsPrepared(gxact);
>
> It doesn't sound too hard to essentially move PrepareTransaction()'s
> ProcArrayClearTransaction() into MarkAsPrepared() and rejigger the
> locking to remove the intermediate state. But I think it'll lead to
> bigger changes than we'd be comfortable backpatching.

Hmm. There's a field in GlobalTransactionData called locking_xid, which
is used to mark the XID of the transaction that's currently operating on
the prepared transaction. At prepare, that ensures that the transaction
cannot be committed or rolled back by another backend until the original
backend has cleared its PGPROC entry. At COMMIT/ROLLBACK PREPARED, it
ensures that only one backend can commit/rollback the transaction.

I wonder why we don't use a VirtualTransactionId there. AFAICS there is
no reason for COMMIT/ROLLBACK PREPARED to be assigned an XID of its own.
And if we used a VirtualTransactionId there, prepare could clear the xid
field of the PGPROC entry earlier.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-04-14 09:32:56 Re: Problem with txid_snapshot_in/out() functionality
Previous Message Etsuro Fujita 2014-04-14 09:03:52 Minor improvements in create_foreign_table.sgml