pg_basebackup from cascading standby after timeline switch

From:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To:	PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	pg_basebackup from cascading standby after timeline switch
Date:	2012-12-17 14:16:09
Message-ID:	50CF2929.5070603@vmware.com
Lists:	pgsql-hackers

pg_basebackup -x is supposed to include all the required WAL files in
the backup, so that you have everything needed to restore a consistent
database. However, it's not including the timeline history files.
Usually that's not a problem because normally you don't need to follow
any old timelines when restoring, but there is one scenario where it
causes a failure to restore:

Create a master, a standby, and a cascading standby. Kill the master
server, promote the standby to become new master, bumping the timeline.
After the cascading standby has followed the timeline switch (either
through the archive, which also works on 9.2, or directly via streaming
replication, which only works on 9.3devel), take a base backup from the
cascading standby using pg_basebackup -x. When you try to start the
server from the new backup (without setting up a restore_command or
streaming replication), you get an error about "unexpected timeline ID 1
in log segment ..."

C 2012-12-17 15:55:25.732 EET 534 LOG: database system was interrupted
while in recovery at log time 2012-12-17 15:55:15 EET
C 2012-12-17 15:55:25.732 EET 534 HINT: If this has occurred more than
once some data might be corrupted and you might need to choose an
earlier recovery target.
C 2012-12-17 15:55:25.732 EET 534 LOG: creating missing WAL directory
"pg_xlog/archive_status"
C 2012-12-17 15:55:25.732 EET 534 LOG: unexpected timeline ID 1 in log
segment 000000020000000000000003, offset 0
C 2012-12-17 15:55:25.732 EET 534 LOG: invalid checkpoint record
C 2012-12-17 15:55:25.733 EET 534 FATAL: could not locate required
checkpoint record
C 2012-12-17 15:55:25.733 EET 534 HINT: If you are not restoring from a
backup, try removing the file
"/home/heikki/pgsql.master/data-standbyC/backup_label".
C 2012-12-17 15:55:25.733 EET 533 LOG: startup process (PID 534) exited
with exit code 1
C 2012-12-17 15:55:25.733 EET 533 LOG: aborting startup due to startup
process failure

The timeline was bumped within the log segment 000000020000000000000003,
so the beginning of the file uses timeline 1, up to the checkpoint
record that changes the timeline. Normally, recovery accepts that
because timeline 1 is an ancestor of timeline 2, but because the backup
does not include the timeline history file, recovery does not know that.
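
To illustrate the check that fails here, a minimal Python sketch (not
PostgreSQL's actual code; the function names are made up for this
example). It assumes the usual 24-hex-digit WAL segment naming, where the
first 8 digits are the timeline ID, and models the set of acceptable
timelines as the segment's own timeline plus whatever ancestors a history
file supplies:

```python
def segment_timeline(segment_name: str) -> int:
    """Return the timeline ID encoded in the first 8 hex digits of a WAL segment name."""
    if len(segment_name) != 24:
        raise ValueError(f"not a WAL segment name: {segment_name!r}")
    return int(segment_name[:8], 16)


def page_timeline_ok(page_tli, segment_name, history_ancestors=None):
    """Simplified model: a page is acceptable if its timeline ID is the
    segment's timeline or one of its known ancestors.

    history_ancestors is the list of ancestor timeline IDs read from a
    timeline history file, or None if no history file is available.
    """
    expected = {segment_timeline(segment_name)}
    expected.update(history_ancestors or ())
    return page_tli in expected


SEGMENT = "000000020000000000000003"

# History file present: timeline 1 is a known ancestor of timeline 2.
print(page_timeline_ok(1, SEGMENT, history_ancestors=[1]))  # True

# History file missing from the backup: timeline 1 looks unexpected.
print(page_timeline_ok(1, SEGMENT))  # False
```

The second call corresponds to the "unexpected timeline ID 1" message in
the log above.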

This does not happen if you run pg_basebackup against the master server,
because in the master it forces an xlog switch, which ensures that the
new xlog file only contains pages with the latest timeline ID. There are
even comments in pg_start_backup explaining that this is the reason for
the xlog switch:

> /*
> * Force an XLOG file switch before the checkpoint, to ensure that the
> * WAL segment the checkpoint is written to doesn't contain pages with
> * old timeline IDs. That would otherwise happen if you called
> * pg_start_backup() right after restoring from a PITR archive: the
> * first WAL segment containing the startup checkpoint has pages in
> * the beginning with the old timeline ID. That can cause trouble at
> * recovery: we won't have a history file covering the old timeline if
> * pg_xlog directory was not included in the base backup and the WAL
> * archive was cleared too before starting the backup.
> *
> * This also ensures that we have emitted a WAL page header that has
> * XLP_BKP_REMOVABLE off before we emit the checkpoint record.
> * Therefore, if a WAL archiver (such as pglesslog) is trying to
> * compress out removable backup blocks, it won't remove any that
> * occur after this point.
> *
> * During recovery, we skip forcing XLOG file switch, which means that
> * the backup taken during recovery is not available for the special
> * recovery case described above.
> */
> if (!backup_started_in_recovery)
>     RequestXLogSwitch();

I'm not happy with the fact that we just ignore the problem in a backup
taken from a standby, silently giving the user a backup that won't start
up. Why not include the timeline history file in the backup? That seems
like a good idea regardless of this issue. I also wonder if
pg_basebackup should include *all* timeline history files in the backup,
not just the latest one strictly required to restore. They're fairly
small, so our approach has generally been to try to include them all in
the archive, and not try to prune them, so the same might make sense here.
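
As a rough sketch of the "include them all" idea, something like the
following Python could collect every timeline history file from a WAL
directory. It is only an illustration, not how pg_basebackup is
implemented; the function name is hypothetical, and it assumes history
files are named as eight uppercase hex digits followed by .history:

```python
import re
import tempfile
from pathlib import Path

# Assumed naming: 8 uppercase hex digits (the timeline ID) + ".history".
HISTORY_RE = re.compile(r"^[0-9A-F]{8}\.history$")


def timeline_history_files(wal_dir):
    """Return all timeline history files in wal_dir, ordered by timeline ID."""
    candidates = (
        p for p in Path(wal_dir).iterdir()
        if p.is_file() and HISTORY_RE.match(p.name)
    )
    return sorted(candidates, key=lambda p: int(p.name[:8], 16))


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for pg_xlog.
    with tempfile.TemporaryDirectory() as wal_dir:
        for name in ("00000002.history",
                     "00000003.history",
                     "000000020000000000000003"):
            (Path(wal_dir) / name).write_text("")
        for path in timeline_history_files(wal_dir):
            print(path.name)
```

The demo creates two history files and one WAL segment in a temporary
directory and lists only the history files, lowest timeline first.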

- Heikki

Responses

Re: pg_basebackup from cascading standby after timeline switch at 2012-12-17 16:19:15 from Tom Lane
Re: pg_basebackup from cascading standby after timeline switch at 2012-12-17 22:44:08 from Simon Riggs
