Re: Error restoring from a base backup taken from standby

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Error restoring from a base backup taken from standby
Date: 2012-12-18 09:18:50
Message-ID: 50D034FA.8070607@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18.12.2012 00:35, Simon Riggs wrote:
> On 17 December 2012 17:39, Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> wrote:
>> (This is different from the other issue related to timeline switches I just
>> posted about. There's no timeline switch involved in this one.)
>>
>> If you do "pg_basebackup -x" against a standby server, in some circumstances
>> the backup fails to restore with error like this:
>>
>> C 2012-12-17 19:09:44.042 EET 7832 LOG: database system was not properly
>> shut down; automatic recovery in progress
>> C 2012-12-17 19:09:44.091 EET 7832 LOG: record with zero length at
>> 0/1764F48
>> C 2012-12-17 19:09:44.091 EET 7832 LOG: redo is not required
>> C 2012-12-17 19:09:44.091 EET 7832 FATAL: WAL ends before end of online
>> backup
>> C 2012-12-17 19:09:44.091 EET 7832 HINT: All WAL generated while online
>> backup was taken must be available at recovery.
>> C 2012-12-17 19:09:44.092 EET 7831 LOG: startup process (PID 7832) exited
>> with exit code 1
>> C 2012-12-17 19:09:44.092 EET 7831 LOG: aborting startup due to startup
>> process failure
>>
>> I spotted this bug while reading the code, and it took me quite a while to
>> actually construct a test case to reproduce the bug, so let me begin by
>> discussing the code where the bug is. You get the above error, "WAL ends
>> before end of online backup", when you reach the end of WAL before reaching
>> the backupEndPoint stored in the control file, which originally comes from
>> the backup_label file. backupEndPoint is only used in a base backup taken
>> from a standby, in a base backup taken from the master, the end-of-backup
>> WAL record is used instead to mark the end of backup. In the xlog redo loop,
>> after replaying each record, we check if we've just reached backupEndPoint,
>> and clear it from the control file if we have. Now the problem is, if there
>> are no WAL records after the checkpoint redo point, we never even enter the
>> redo loop, so backupEndPoint is not cleared even though it's reached
>> immediately after reading the initial checkpoint record.
>>
>> To deal with the similar situation wrt. reaching consistency for hot standby
>> purposes, we call CheckRecoveryConsistency() before the redo loop. The
>> straightforward fix is to copy-paste the check for backupEndPoint to just
>> before the redo loop, next to the CheckRecoveryConsistency() call. Even
>> better, I think we should move the backupEndPoint check into
>> CheckRecoveryConsistency(). It's already responsible for keeping track of
>> whether minRecoveryPoint has been reached, so it seems like a good idea to
>> do this check there as well.
>>
>> Attached is a patch for that (for 9.2), as well as a script I used to
>> reproduce the bug. The script is a bit messy, and requires tweaking the
>> paths at the top. Anyone spot a problem with this?
>
> Yep. The problem is one of design, not merely a coding error.
>
> The design assumes that the master is up, connected and doing work.
> Which is perfectly reasonable, since this is the reason we are doing a
> backup from the standby. But obviously not always true, so what we're
> saying is that the selection of backupEndPoint is incorrect in the
> first place. Putting code in to cope with that poor choice at recovery
> seems wrong - we should record the correct location.

Hmm, pg_controlinfo says:

pg_control version number: 922
Catalog version number: 201204301
Database system identifier: 5823207860608399803
Database cluster state: in crash recovery
pg_control last modified: ti 18. joulukuuta 2012 11.00.58
Latest checkpoint location: 0/1764EE8
Prior checkpoint location: 0/1764EE8
Latest checkpoint's REDO location: 0/1764EE8
Latest checkpoint's TimeLineID: 1
...
Minimum recovery ending location: 0/1764F48
Backup start location: 0/1764EE8
Backup end location: 0/1764F48
...

That seems correct to me. The backup is considered valid after reaching
0/1764F48, which is where the checkpoint record ends. minRecoveryPoint
is set to the same. What do you think it should be set to?

> In passing, I see another problem as well. We assume that the backup
> is still valid if we promote the standby to a master during the
> backup, but that isn't documented and so isn't full analysed to see
> that is true.

No, that is taken care of in do_pg_stop_backup:

> /*
> * Parse the BACKUP FROM line. If we are taking an online backup from the
> * standby, we confirm that the standby has not been promoted during the
> * backup.
> */
> ptr = strstr(remaining, "BACKUP FROM:");
> if (!ptr || sscanf(ptr, "BACKUP FROM: %19s\n", backupfrom) != 1)
> ereport(ERROR,
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
> if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
> ereport(ERROR,
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> errmsg("the standby was promoted during online backup"),
> errhint("This means that the backup being taken is corrupt "
> "and should not be used. "
> "Try taking another online backup.")));

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2012-12-18 09:30:50 Re: Error restoring from a base backup taken from standby
Previous Message Alexander Korotkov 2012-12-18 09:10:00 Re: WIP: index support for regexp search