Re: BUG #8686: Standby could not restart.

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tomonari Katsumata <t(dot)katsumata1122(at)gmail(dot)com>
Cc: Tomonari Katsumata <katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #8686: Standby could not restart.
Date: 2014-01-06 14:38:16
Message-ID: 52CABFD8.8090101@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 12/23/2013 08:15 AM, Tomonari Katsumata wrote:
>> /*
>>> * Initialize shared replayEndRecPtr,
>>> lastReplayedEndRecPtr, and
>>> * recoveryLastXTime.
>>> *
>>> * This is slightly confusing if we're starting from an
>>> online
>>> * checkpoint; we've just read and replayed the
>>> checkpoint record, but
>>> * we're going to start replay from its redo pointer,
>>> which precedes
>>> * the location of the checkpoint record itself. So even
>>> though the
>>> * last record we've replayed is indeed ReadRecPtr, we
>>> haven't
>>> * replayed all the preceding records yet. That's OK for
>>> the current
>>> * use of these variables.
>>> */
>>> SpinLockAcquire(&xlogctl->info_lck);
>>> xlogctl->replayEndRecPtr = ReadRecPtr;
>>> xlogctl->lastReplayedEndRecPtr = EndRecPtr;
>>> xlogctl->recoveryLastXTime = 0;
>>> xlogctl->currentChunkStartTime = 0;
>>> xlogctl->recoveryPause = false;
>>> SpinLockRelease(&xlogctl->info_lck);
>>>
>>
>> I think we need to fix that confusion. Your patch will do it by not
>> setting EndRecPtr yet; that fixes the bug, but leaves those variables in a
>> slightly strange state; I'm not sure what EndRecPtr points to in that case
>> (0 ?), but ReadRecPtr would be set I guess.
>>
> Yes, the values were set like below.
> ReadRecPtr:1/8E7F0B0
> EndRecPtr:0/0
>
>
>
>>
>> Perhaps we should reset replayEndRecPtr and lastReplayedEndRecPtr to the
>> REDO point here, instead of ReadRecPtr/EndRecPtr.
>
> I made another patch.
> I added a ReadRecord to make sure the REDO location is present or not.
> The similar process are done when we use backup_label.
>
> Because the ReadRecord returns a record already read,
> I set ReadRecPtr of the record to EndRecPtr.
> And also I set record->xl_prev to ReadRecPtr.
> As you said, it also worked fine.
>
> I'm not sure we should do same thing when crash recovery occurs, but now I
> added the process when archive recovery is needed.
>
> Please see attached patch.

Hmm. That would still initialize lastReplayedEndRecPtr to the checkpoint
record, when you do crash recovery (or begin archive recovery from a
filesystem snapshot, where you perform crash recovery before starting to
read the archive). I'm not very comfortable with that, even though I
don't see an immediate problem with it.

I also noticed that CheckRecoveryConsistency() compares backupEndPoint
with EndRecPtr, but minRecoveryPoint with lastReplayedEndRecPtr. That's
correct as the code stands, but it's an accident waiting to happen: if
you called CheckRecoveryConsistency() after reading a record with
ReadRecord(), but before fully replaying it, it might conclude that it's
reached the backup end location one record too early. And it's
inconsistent, anyway.

I propose the attached patch. I haven't tested it, but I think it's a
slightly more robust fix.

- Heikki

Attachment Content-Type Size
fix-lastReplayedEndRecPtr-1.patch text/x-diff 3.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2014-01-06 14:45:32 Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages
Previous Message Heikki Linnakangas 2014-01-06 14:35:42 Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages