Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <jan(at)wi3ck(dot)info>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
Date: 2014-04-14 08:06:17
Message-ID: 534B96F9.6040602@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 04/13/2014 11:39 PM, Heikki Linnakangas wrote:
> However, I just noticed that there's a race condition between PREPARE
> TRANSACTION and COMMIT/ROLLBACK PREPARED. PostPrepare_Locks runs after
> the prepared transaction is already marked as fully prepared. That means
> that by the time we get to PostPrepare_Locks, another backend might
> already have finished and removed the prepared transaction. That leads
> to a PANIC (put a breakpoint just before PostPrepare_Locks):
>
> postgres=# commit prepared 'foo';
> PANIC: failed to re-find shared proclock object
> PANIC: failed to re-find shared proclock object
> The connection to the server was lost. Attempting reset: Failed.
>
> FinishPrepareTransaction reads the list of locks from the two-phase
> state file, but PANICs when it doesn't find the corresponding locks in
> the lock manager (because PostPrepare_Locks hasn't transfered them to
> the dummy PGPROC yet).
>
> I think we'll need to transfer of the locks earlier, before the
> transaction is marked as fully prepared. I'll take a closer look at this
> tomorrow.

Here's a patch to do that. It's very straightforward, I just moved the
calls to transfer locks earlier, before ProcArrayClearTransaction.
PostPrepare_MultiXact had a similar race - it also transfer state from
the old PGPROC entry to the new, and needs to be done before allowing
another backend to remove the new PGPROC entry. I changed the names of
the functions to distinguish them from the other PostPrepare_* functions
that now happen at a different time.

The patch is simple, but it's a bit scary to change the order of things
like this. Looking at all the calls that now happen after transferring
the locks, I believe this is OK. The change also applies to the
callbacks called by the RegisterXactCallback mechanism, which means that
in theory there might be a 3rd party extension out there that's
affected. All the callbacks in contrib and plpgsql are OK, and it's
questionable to do anything complicated that would depend on
heavy-weight locks to be held in those callbacks, so I think this is OK.
Warrants a note in the release notes, though.

- Heikki

Attachment Content-Type Size
fix-prepare-transaction-race-1.patch text/x-diff 4.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Albe Laurenz 2014-04-14 08:19:51 Re: CLOB & BLOB limitations in PostgreSQL
Previous Message Christian Ullrich 2014-04-14 06:16:09 Re: PostgreSQL in Windows console and Ctrl-C