Re: Recovery inconsistencies, standby much larger than primary

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Recovery inconsistencies, standby much larger than primary
Date: 2014-02-14 20:46:01
Message-ID: CAM-w4HNy6isA41tnY5sH3=rVswEUD8XfjYc6kGGg5Pwzqxnjdw@mail.gmail.com
Lists: pgsql-hackers

Going over this I think this is still a potential issue:

On 31 Jan 2014 15:56, "Andres Freund" <andres(at)2ndquadrant(dot)com> wrote:

>
> I am not sure that explains the issue, but I think the redo action for
> truncation is not safe across crashes. A XLOG_SMGR_TRUNCATE will just
> do a smgrtruncate() (and then mdtruncate) which will iterate over the
> segments starting at 0 till mdnblocks()/segment_size and *truncate* but
> not delete individual segment files that are not needed anymore, right?
> If we crash in the midst of that a new mdtruncate() will be issued, but
> it will get a shorter value back from mdnblocks().
>
> Am I missing something?
>

I'm not too familiar with md.c, but my reading of the code is that we
truncate the files in reverse order? In that case I think the code is safe
*iff* the filesystem guarantees ordered metadata writes, which I think ext3
does (I think in all the journal modes). On most filesystems metadata writes
are synchronous, so the truncates are safe there too.
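
To make the ordering argument concrete, here is a toy in-memory model. It is
not md.c code: SEG_BLOCKS, toy_nblocks(), toy_truncate() and
orphaned_after_crash() are all made-up names, and the "crash" is just a cap on
how many segment files get shrunk. It assumes, as in the quoted mail, that the
block count is found by walking segments from 0 until one is not full, and that
the shrinks reach disk in the order they were issued.

```c
#include <assert.h>

#define SEG_BLOCKS 4   /* toy stand-in for RELSEG_SIZE */
#define NSEGS      3   /* segment files in the toy relation */

/* Visible size: walk from segment 0, stop at the first non-full segment. */
int toy_nblocks(const int seg[NSEGS])
{
    int total = 0;
    for (int i = 0; i < NSEGS; i++)
    {
        total += seg[i];
        if (seg[i] < SEG_BLOCKS)
            break;
    }
    return total;
}

/* Blocks physically present in all segment files, visible or not. */
int toy_total_blocks(const int seg[NSEGS])
{
    int total = 0;
    for (int i = 0; i < NSEGS; i++)
        total += seg[i];
    return total;
}

/*
 * Truncate to 'target' blocks, shrinking at most 'max_steps' segment files
 * (a small max_steps simulates a crash part way through). Only segments
 * that toy_nblocks() can currently see are considered.
 */
int toy_truncate(int seg[NSEGS], int target, int reverse, int max_steps)
{
    int visible = toy_nblocks(seg);
    int nsegs = (visible + SEG_BLOCKS - 1) / SEG_BLOCKS;
    int steps = 0;

    assert(target >= 0);
    for (int k = 0; k < nsegs && steps < max_steps; k++)
    {
        int i = reverse ? nsegs - 1 - k : k;
        int keep = target - i * SEG_BLOCKS;

        if (keep < 0)
            keep = 0;
        if (seg[i] > keep)
        {
            seg[i] = keep;
            steps++;
        }
    }
    return steps;
}

/* Crash after one shrink, redo the truncate, count blocks left unreachable. */
int orphaned_after_crash(int reverse)
{
    int seg[NSEGS] = {SEG_BLOCKS, SEG_BLOCKS, SEG_BLOCKS};

    toy_truncate(seg, 2, reverse, 1);      /* interrupted first attempt */
    toy_truncate(seg, 2, reverse, NSEGS);  /* redo after restart */
    return toy_total_blocks(seg) - toy_nblocks(seg);
}
```

In this model, going forward leaves the later segments full but invisible to
the redo, while going in reverse keeps the remaining segments reachable so the
redo can finish the job. It says nothing about what a real filesystem does with
the order of the writes.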

But we don't generally rely on metadata writes being ordered. I think the
"correct" thing to do is to record nblocks prior to the truncate and
then have md.c expose a new function that takes that value and pokes
around looking for any segments it might need to clean up. But that would
involve lots of abstraction violations in md.c. I think using nblocks would
keep the violations within md.c, but that still seems like a pain.
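
A rough sketch of what I mean, again as a toy in-memory model and not md.c
code. toy_cleanup_with_old_nblocks() is a hypothetical name, and where the old
nblocks value gets recorded is left open here. The only point is that the
cleanup loop is bounded by the recorded pre-truncate size instead of whatever
the segment walk currently reports.

```c
#include <assert.h>

#define SEG_BLOCKS 4   /* toy stand-in for RELSEG_SIZE */
#define NSEGS      3   /* segment files in the toy relation */

/*
 * Shrink every segment that could have held data before the truncate,
 * using the recorded old_nblocks as the bound. Returns the number of
 * segment files that had to be fixed up.
 */
int toy_cleanup_with_old_nblocks(int seg[NSEGS], int old_nblocks, int target)
{
    int nsegs = (old_nblocks + SEG_BLOCKS - 1) / SEG_BLOCKS;
    int fixed = 0;

    assert(target >= 0 && target <= old_nblocks);
    assert(old_nblocks <= NSEGS * SEG_BLOCKS);

    for (int i = 0; i < nsegs; i++)
    {
        int keep = target - i * SEG_BLOCKS;

        if (keep < 0)
            keep = 0;
        if (seg[i] > keep)
        {
            seg[i] = keep;   /* leftover from an interrupted truncate */
            fixed++;
        }
    }
    return fixed;
}
```

Starting from the state {2, 4, 4} (first segment already shrunk, the other two
left behind), a call with old_nblocks = 12 and target = 2 empties both
leftover segments even though a walk from segment 0 would stop at the first
one.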
