Re: pg_subtrans and WAL

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_subtrans and WAL
Date: 2004-08-20 17:36:39
Message-ID: 751.1093023399@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> writes:
> On Tue, Aug 10, 2004 at 12:24:06PM -0400, Tom Lane wrote:
>> It may be that we do not care because pg_subtrans doesn't have to be
>> valid after a crash, but I haven't seen any proof of that theory.

> The whole point of the subtrans info is to be available _while_ the
> transaction tree is running. If there is a crash, then by definition no
> backend can be running when we return, so pg_subtrans info is useless at
> that point. We only need pg_clog to be correct.

But we also have to be sure that we don't try to access the useless info
anyway. For instance some pre-crash subxacts might remain marked
SUBCOMMITTED in clog indefinitely. I think this could be worked around:
for example, TransactionIdDidCommit could assume that any SUBCOMMITTED
xact older than RecentGlobalXmin must represent a child of a crashed
parent. It shouldn't be too hard to guarantee that we never touch
pg_subtrans for XIDs older than RecentGlobalXmin. We don't have that
guarantee in place at the moment though.
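
A minimal sketch of the workaround described above, written as a toy
in-memory model rather than actual backend code. The arrays clog[] and
subtrans_parent[], the variable recent_global_xmin, and the helpers
xid_precedes() and did_commit() are all hypothetical stand-ins; the real
TransactionIdDidCommit, the clog/subtrans storage, and wraparound-aware
XID comparison are more involved. The only point illustrated is that a
SUBCOMMITTED xact older than the cutoff is treated as not committed
without ever consulting the parent mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Toy status codes, standing in for the clog status values. */
typedef enum
{
    XS_IN_PROGRESS = 0,
    XS_COMMITTED,
    XS_ABORTED,
    XS_SUB_COMMITTED
} XidStatus;

#define MAX_XID 64

/* Hypothetical in-memory stand-ins for pg_clog and pg_subtrans. */
static XidStatus     clog[MAX_XID];
static TransactionId subtrans_parent[MAX_XID];   /* 0 means "no parent" */
static TransactionId recent_global_xmin;         /* stand-in for RecentGlobalXmin */

/* Assumption: plain comparison; real XID comparison must handle wraparound. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return a < b;
}

static bool
did_commit(TransactionId xid)
{
    assert(xid < MAX_XID);

    switch (clog[xid])
    {
        case XS_COMMITTED:
            return true;

        case XS_SUB_COMMITTED:
            /*
             * Older than the cutoff: assume this is a child of a parent
             * that crashed.  Report "not committed" and never look at the
             * (possibly garbage) subtrans data.
             */
            if (xid_precedes(xid, recent_global_xmin))
                return false;

            /* Still recent enough: the answer depends on the parent. */
            if (subtrans_parent[xid] == 0)
                return false;
            return did_commit(subtrans_parent[xid]);

        default:
            return false;   /* in progress or aborted */
    }
}
```

In this model, a subcommitted XID at or above recent_global_xmin follows
its parent's status, while one below it is reported as not committed
regardless of what subtrans_parent[] contains.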

>> And if that theory is correct, then it is a seriously bad design to be
>> using the same code infrastructure for both pg_clog and pg_subtrans.
>> Every fsync on pg_subtrans is wasted effort if that is going to be our
>> approach.

> Right, but AFAICS both pg_clog and pg_subtrans are only fsync'ed during
> checkpoint and shutdown, so it doesn't seem that costly. We could
> certainly skip calling CheckPointSUBTRANS() or making it a noop ...

The point is that the behaviors are fundamentally different. We have no
need for any WAL log entries for pg_subtrans; we should never fsync it;
and the rules for deciding when and where to truncate it are a lot
different (or at least should be different). I thought from the
beginning that the slru layer underneath pg_clog was bad from the point
of view of obfuscating the code, because it forced an awkward division
of labor between clog.c and slru.c. Now that I realize that there's not
that much behavior that we really want to share, I wonder whether we
shouldn't revert that change and make subtrans.c stand on its own.

> On a related note: if we mark a Xid with SUBTRANS COMMIT and later crash
> without updating it, the main Xid will remain in in-progress status. At
> what point is it marked aborted?

I do not think there's any guarantee that it ever will be so marked.
Certainly it could be a very long time until someone exhibits any
interest in that particular Xid's status...

regards, tom lane
