Re: Two-phase commit issues

From: Alvaro Herrera <alvherre(at)surnet(dot)cl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Two-phase commit issues
Date: 2005-05-18 23:06:35
Message-ID: 20050518230635.GB10521@surnet.cl
Lists: pgsql-hackers

On Wed, May 18, 2005 at 05:15:09PM -0400, Tom Lane wrote:
> I've started to look seriously at Heikki's patch for two-phase commit.

Hmm. I started reviewing it a few days ago, intending to fix some things
here and there and then present it all to you later, as a pre-filter to
get some bugs out first.

> There are a few issues that probably deserve discussion:
>
> * The major missing issue that I've come across so far is that
> subtransaction and multixact state isn't preserved across a crash.
[...]
> (AFAICS it's sufficient to make each subxact link directly to the top
> XID, even if there was a more complex hierarchy originally.)

Right, we don't care about the hierarchy; we know all those subXids were
committed.
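
To make the flattening concrete, here is a minimal sketch in Python rather than backend C. It assumes the hierarchy is available as a child-to-parent map; the function name and the XID values are hypothetical and nothing here is taken from the patch.

```python
def flatten_to_top(parent_of):
    """Map every subtransaction XID directly to its top-level XID.

    parent_of: dict of child XID -> immediate parent XID (hypothetical
    input format; top-level XIDs never appear as keys).
    """
    def top(xid):
        # Walk upward until we reach an XID with no recorded parent.
        while xid in parent_of:
            xid = parent_of[xid]
        return xid

    return {child: top(child) for child in parent_of}


# Hypothetical example: 100 is the prepared top-level xact, 101 is its
# child, and 102 and 103 are nested under 101.
hierarchy = {101: 100, 102: 101, 103: 101}
flat = flatten_to_top(hierarchy)
```

After flattening, 101, 102 and 103 all point straight at 100, which is all that matters once the whole tree is known to share the fate of the top-level xact.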

> Similarly, we've got to reconstruct MultiXactIds that any prepared
> xacts are members of, else row-level locks taken out by prepared xacts
> won't be enforced correctly. I think this can be handled if we add to
> the state files a list of all MultiXactIds that each prepared xact
> belongs to, and then during restart forcibly recreate those
> MultiXactIds. (They would only be rebuilt with prepared XIDs, not any
> ordinary XIDs that might originally have been members.) This seems to
> require some new code in multixact.c, but not anything fundamentally
> difficult --- Alvaro, do you see any likely problems in this stuff?

I'm not sure whether the following case matters: Xid=1 and Xid=2 both
participate in the same MultiXactId, and Xid=1 is not yet prepared when
Xid=2 prepares. If Xid=1 is prepared later, the MultiXactId needs to be
restored with both Xids as participants.
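
A toy model of that scenario, as a sketch only: it assumes each prepared xact's state file independently lists the MultiXactIds it belongs to, and that restart rebuilds every MultiXactId from the union of those lists. The function name, the data layout and the id 7 are made up for illustration; this is not multixact.c.

```python
from collections import defaultdict


def rebuild_multixacts(state_files):
    """Rebuild MultiXactId membership from prepared-xact state files.

    state_files: dict of prepared XID -> list of MultiXactIds recorded in
    that xact's state file (hypothetical layout).
    Returns a dict of MultiXactId -> set of prepared member XIDs.
    """
    members = defaultdict(set)
    for xid, multis in state_files.items():
        for multi in multis:
            # Only prepared XIDs are ever added; ordinary XIDs that were
            # members before the crash have no state file to list them.
            members[multi].add(xid)
    return dict(members)


# Crash after only Xid=2 has prepared: MultiXactId 7 comes back with {2}.
after_first = rebuild_multixacts({2: [7]})

# Crash after Xid=1 has prepared as well: both must be members again.
after_both = rebuild_multixacts({2: [7], 1: [7]})
```

Under those assumptions the order of the two PREPAREs doesn't matter, because membership is derived from each state file separately. If the lists were instead written once per MultiXactId at the time of the first PREPARE, the later one would be lost.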

> * The patch is designed to dump state files into WAL as well as onto
> disk. Why? Wouldn't it be better just to write and fsync the state
> file before reporting successful prepare? That would get rid of the
> need for checkpoint-time fsyncs.

I made the same observation.
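
For reference, the "write and fsync before reporting success" ordering looks roughly like this. It is a sketch in Python on a POSIX system, not backend code; the temp-file-plus-rename step and the directory fsync are my additions, and the file name is invented.

```python
import os
import tempfile


def write_state_file_durably(path, payload):
    """Write payload to path and return only once it is on stable storage."""
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush the file contents before making it visible
    finally:
        os.close(fd)
    os.replace(tmp, path)  # atomic rename on POSIX filesystems
    # fsync the containing directory so the rename itself survives a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


# Hypothetical usage: only after this call returns would PREPARE report success.
with tempfile.TemporaryDirectory() as state_dir:
    target = os.path.join(state_dir, "0000029A")  # invented file name
    write_state_file_durably(target, b"prepared-xact-state")
    with open(target, "rb") as f:
        round_trip = f.read()
```

With this ordering nothing is left for a checkpoint to fsync, since the file is already durable by the time the client hears that the prepare succeeded.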

> * I'm inclined to think that the "gid" identifiers for prepared
> transactions ought to be SQL identifiers (names), not string literals.
> Was there a particular reason for making them strings?

Ditto.

> * There are some fairly ugly cases associated with creation and deletion
> of temporary tables as well. I think we might want to just decree that
> you can't PREPARE a transaction that included creating or dropping a
> temp table. Does anyone have much of a problem with that?

Does this affect any of the other things that use the direct-fsync-no-WAL
path in the smgr?

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Having your biases confirmed independently is how scientific progress is
made, and hence made our great society what it is today" (Mary Gardiner)
