Re: Recovery inconsistencies, standby much larger than primary

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Recovery inconsistencies, standby much larger than primary
Date: 2014-02-13 03:29:13
Message-ID: 8473.1392262153@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> On Wed, Feb 12, 2014 at 8:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Oh, wait a minute. It's not just a matter of whether we find the right
>> block: we also have to consider whether XLogReadBufferExtended will
>> apply the right "mode" behavior. Currently, it supposes that all pages
>> past the initially observed EOF should be assumed to be uninitialized;
>> but if we're working with an inconsistent database, that seems like
>> an unsafe assumption. It might be that a page is there but we've not
>> (yet) fixed the length of some preceding segment. If we want to not
>> get bogus "WAL contains references to invalid pages" failures in such
>> scenarios, it seems like we need a more invasive change than what
>> I just committed. I think your patch didn't cover this consideration
>> either.

> Hm. I *think* those cases would be handled anyways since the table
> would later be truncated. Arguably any reference after the "short"
> segment is a "reference to an invalid page" since it means it's a
> record which predates the records which caused the extension.

Well, that would be the case if you assume perfectly sequential filesystem
behavior, but I'm not sure the assumption holds if the starting condition
is a base backup. We could be looking at a version of segment 1 that
predates segment 2's existence, and yet see some data in segment 2 as
well, because it's not a perfectly coherent snapshot.
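
To make that scenario concrete, here is a toy model of it. This is only a
sketch, not PostgreSQL code: the tiny RELSEG_SIZE, the function names, and
the list-of-lengths representation are all hypothetical, and segments are
numbered from zero here. It assumes the relation length is found by walking
the segments and stopping at the first one that is shorter than a full
segment.

```python
RELSEG_SIZE = 4  # blocks per segment; hypothetical tiny value for illustration


def observed_nblocks(segment_lengths):
    """Count blocks by walking segments, stopping at the first short one."""
    total = 0
    for length in segment_lengths:
        total += length
        if length < RELSEG_SIZE:
            break  # a short segment is taken to be the end of the relation
    return total


def locate(blkno):
    """Map a relation block number to (segment index, block within segment)."""
    return divmod(blkno, RELSEG_SIZE)


def block_exists_on_disk(segment_lengths, blkno):
    """True if the block is physically present, whatever the observed EOF says."""
    segno, offset = locate(blkno)
    return segno < len(segment_lengths) and offset < segment_lengths[segno]


# Incoherent snapshot: the first segment was copied while it held only 2
# blocks, the second segment was copied later and already holds 3 blocks.
snapshot = [2, 3]
eof = observed_nblocks(snapshot)
blkno = 5  # lives in the second segment

print(eof)                                    # 2
print(locate(blkno))                          # (1, 1)
print(block_exists_on_disk(snapshot, blkno))  # True
print(blkno >= eof)                           # True: treated as past EOF
```

In this model block 5 is on disk, yet it lies beyond the observed EOF, so
anything that assumes pages past EOF are uninitialized would get it wrong.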

I think what you're arguing is that we should see WAL records filling the
rest of segment 1 before we see any references to segment 2, but if that's
the case then how did we get into the situation you reported? Or is it
just that it was a broken base backup to start with?

regards, tom lane
