Re: Hard limit on WAL space used (because PANIC sucks)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2014-01-22 15:41:55
Message-ID: 20140122154155.GJ21170@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-01-21 21:42:19 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-01-21 19:45:19 -0500, Tom Lane wrote:
> >> I don't think that's a comparable case. Incomplete actions are actions
> >> to be taken immediately, and which the replayer then has to complete
> >> somehow if it doesn't find the rest of the action in the WAL sequence.
> >> The only thing to be done with the records I'm proposing is to remember
> >> their contents (in some fashion) until it's time to apply them. If you
> >> hit end of WAL you don't really have to do anything.
>
> > Would that work for the promotion case as well? Afair there's the
> > assumption that everything >= TransactionXmin can be looked up in
> > pg_subtrans or in the procarray - which afaics wouldn't be the case with
> > your scheme? And TransactionXmin could very well be below such an
> > "incomplete commit"'s xids afaics.
>
> Uh, what? The behavior I'm talking about is *exactly the same*
> as what happens now. The only change is that the data sent to the
> WAL file is laid out a bit differently, and the replay logic has
> to work harder to reassemble it before it can apply the commit or
> abort action. If anything outside replay can detect a difference
> at all, that would be a bug.
>
> Once again: the replayer is not supposed to act immediately on the
> subsidiary records. It's just supposed to remember their contents
> so it can reattach them to the eventual commit or abort record,
> and then do what it does today to replay the commit or abort.

I think I get what you want to do, but splitting the record like that
nonetheless opens up behaviour that previously wasn't there. Imagine we
promote in between replaying the list of subxacts (only storing it in
memory) and the main commit record. Either we have something like the
incomplete action stuff doing something with the in-memory data, or we
are in a situation where there can be xids bigger than TransactionXmin
that are in neither pg_subtrans nor the procarray. I don't think that
situation exists today, since we either read the commit record in its
entirety or not at all.
We'd also need to use the MyPgXact->delayChkpt mechanism to prevent
checkpoints from occurring in between those records, but we do that
already, so that seems rather uncontroversial.
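
To make the promotion scenario concrete, here is a toy model of it. This
is not PostgreSQL code: ToyReplayer, its fields and its methods are all
hypothetical names made up for illustration, and the real replay and
procarray logic is considerably more involved. It only shows the window
between the subsidiary record and the commit record.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplayer:
    """Toy stand-in for replay state; every name here is hypothetical."""
    transaction_xmin: int
    pg_subtrans: dict = field(default_factory=dict)       # subxid -> parent xid
    procarray: set = field(default_factory=set)           # xids tracked as running
    pending_subxacts: dict = field(default_factory=dict)  # top xid -> subxids, memory only

    def replay_subxact_list(self, top_xid, subxids):
        # Split-record scheme: only remember the contents, do not act on them.
        self.pending_subxacts.setdefault(top_xid, []).extend(subxids)

    def replay_commit(self, top_xid):
        # Reattach the remembered subxids and apply the commit as one unit.
        subxids = self.pending_subxacts.pop(top_xid, [])
        self.procarray.discard(top_xid)
        return [top_xid, *subxids]

    def untracked_xids(self):
        # xids >= transaction_xmin that live only in replay memory,
        # i.e. in neither pg_subtrans nor the procarray.
        return sorted(
            xid
            for subxids in self.pending_subxacts.values()
            for xid in subxids
            if xid >= self.transaction_xmin
            and xid not in self.pg_subtrans
            and xid not in self.procarray
        )


replayer = ToyReplayer(transaction_xmin=100)
replayer.procarray.add(100)                    # top-level xid is tracked
replayer.replay_subxact_list(100, [101, 102])  # subsidiary record replayed

# Promoting at this point would expose these xids:
exposed_at_promotion = replayer.untracked_xids()   # [101, 102]

# If the commit record is replayed first, nothing is left dangling:
committed = replayer.replay_commit(100)            # [100, 101, 102]
exposed_after_commit = replayer.untracked_xids()   # []
```

In this model the problem goes away only if promotion either waits for
the commit record or does something with pending_subxacts, which is the
"incomplete action"-like handling mentioned above.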

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
