Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, matioli(dot)matheus(at)gmail(dot)com, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, Максим Панченко <Panchenko(at)gw(dot)tander(dot)ru>, Сизов Сергей Павлович <sizov_sp(at)gw(dot)tander(dot)ru>
Subject: Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages
Date: 2014-01-13 21:40:22
Message-ID: 20140113214022.GD5838@awork2.anarazel.de
Lists: pgsql-bugs pgsql-hackers

On 2014-01-13 16:36:41 -0500, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> > Good point. Normally, we expect the checksum to match on all pages that
> > we read during WAL replay, because full page writes will initialize any
> > page that is modified to an untorn state, before it's ever read. But we
> > can't rely on that in the extra read that btree_xlog_vacuum() does.
>
> But it's not an "extra" read. It's replicating a read that was done
> on the master in the btvacuumpage() scan. AFAICS the only way to fail
> on the slave and not the master is if the slave has inconsistent data,
> in which case you're at hazard of failing anyway.

I tried to explain the scenario I see as dangerous in a nearby message in this thread.

> >> Now, you could argue that that shouldn't be the case because we're only
> >> entering that codepath once STANDBY_SNAPSHOT_READY and you might be
> >> right...
>
> > I don't think that saves us. standbyMode can be STANDBY_SNAPSHOT_READY,
> > before we reach consistency. Adding a check for reachedConsistency,
> > though, ought to fix it.
>
> Huh? Surely we're not letting queries in until we're consistent.

We don't, but STANDBY_SNAPSHOT_READY isn't the only variable controlling
that. It just determines whether we'd have the necessary visibility
information. The full check is:
    if (standbyState == STANDBY_SNAPSHOT_READY &&
        !LocalHotStandbyActive &&
        reachedConsistency &&
        IsUnderPostmaster)
    {
        ...
        xlogctl->SharedHotStandbyActive = true;
        ...
        SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
    }

So we need to mimic that.
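
To illustrate the shape of such a check, here is a minimal standalone sketch, not the actual patch. The enum and the helper name hot_standby_could_be_active() are stand-ins I made up for illustration; the real code uses backend globals rather than parameters, and which of the conditions above need to be repeated in the redo routine is an assumption here.

```c
#include <stdbool.h>

/* Illustrative stand-in for the standby state; not the backend's definition. */
typedef enum
{
    STANDBY_DISABLED,
    STANDBY_INITIALIZED,
    STANDBY_SNAPSHOT_PENDING,
    STANDBY_SNAPSHOT_READY
} HotStandbyState;

/*
 * Hypothetical helper: returns true only if queries could already have been
 * let in. Having the snapshot ready is not enough on its own; consistency
 * must have been reached and we must be running under the postmaster.
 */
bool
hot_standby_could_be_active(HotStandbyState standby_state,
                            bool reached_consistency,
                            bool is_under_postmaster)
{
    return standby_state == STANDBY_SNAPSHOT_READY &&
           reached_consistency &&
           is_under_postmaster;
}
```

A redo routine would then only take the hot-standby-specific path when this returns true, so that STANDBY_SNAPSHOT_READY before consistency does not trigger it.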

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
