Re: [ADMIN] Attempt to re-archive existing WAL logsafterrestoringfrom backup

Lists: pgsql-adminpgsql-hackers
From: Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk>
To: pgsql-admin(at)postgresql(dot)org
Subject: Attempt to re-archive existing WAL logs after restoring from backup
Date: 2007-05-29 04:10:45
Message-ID: 465BA7C5.6080400@vcxz.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

Hello.

I've been testing my DB backup and restore procedure and I've run into
something I can't figure out. After recovering from a PITR backup, when
I do another pg_start_backup PostgreSQL attempts to re-archive the last
log again, which fails because it already exists in the archive. I'm
running the Ubuntu 7.04 version of PostgreSQL 8.2.4. Here's an example
log interspersed with what I was doing in the gaps (sorry for posting
the whole thing; I'm new to PostgreSQL so I'm not sure what's
significant and what isn't):

--------
2007-05-29 02:18:33 UTC LOG: archived transaction log file
"000000010000000000000012"
2007-05-29 02:18:34 UTC LOG: incomplete startup packet
2007-05-29 02:18:34 UTC LOG: received fast shutdown request
2007-05-29 02:18:34 UTC LOG: shutting down
2007-05-29 02:18:34 UTC LOG: database system is shut down

Deleted data directory and restored from base backup. Restarted DB:

2007-05-29 02:20:33 UTC LOG: could not load root certificate file
"root.crt": no SSL error reported
2007-05-29 02:20:33 UTC DETAIL: Will not verify client certificates.
2007-05-29 02:20:33 UTC LOG: database system was interrupted at
2007-05-29 02:16:06 UTC
2007-05-29 02:20:33 UTC LOG: starting archive recovery
2007-05-29 02:20:33 UTC LOG: restore_command =
"/usr/local/sbin/xlogrestore %f %p"
2007-05-29 02:20:33 UTC LOG: incomplete startup packet
2007-05-29 02:20:34 UTC LOG: restored log file
"000000010000000000000011.000118A0.backup" from archive
2007-05-29 02:20:34 UTC FATAL: the database system is starting up
2007-05-29 02:20:34 UTC FATAL: the database system is starting up
2007-05-29 02:20:35 UTC LOG: restored log file
"000000010000000000000011" from archive
2007-05-29 02:20:35 UTC LOG: checkpoint record is at 0/110118A0
2007-05-29 02:20:35 UTC LOG: redo record is at 0/110118A0; undo record
is at 0/0; shutdown FALSE
2007-05-29 02:20:35 UTC LOG: next transaction ID: 0/975; next OID: 24576
2007-05-29 02:20:35 UTC LOG: next MultiXactId: 1; next MultiXactOffset: 0
2007-05-29 02:20:35 UTC LOG: automatic recovery in progress
2007-05-29 02:20:35 UTC LOG: redo starts at 0/110118E8
2007-05-29 02:20:35 UTC FATAL: the database system is starting up
2007-05-29 02:20:35 UTC LOG: restored log file
"000000010000000000000012" from archive
2007-05-29 02:20:36 UTC FATAL: the database system is starting up
2007-05-29 02:20:36 UTC LOG: could not open file
"pg_xlog/000000010000000000000013" (log file 0, segment 19): No such
file or directory
2007-05-29 02:20:36 UTC LOG: redo done at 0/120000C8
2007-05-29 02:20:36 UTC FATAL: the database system is starting up
2007-05-29 02:20:37 UTC FATAL: the database system is starting up
2007-05-29 02:20:37 UTC FATAL: the database system is starting up
2007-05-29 02:20:38 UTC FATAL: the database system is starting up
2007-05-29 02:20:38 UTC FATAL: the database system is starting up
2007-05-29 02:20:39 UTC FATAL: the database system is starting up
2007-05-29 02:20:39 UTC FATAL: the database system is starting up
2007-05-29 02:20:39 UTC LOG: incomplete startup packet
2007-05-29 02:20:39 UTC LOG: restored log file
"000000010000000000000012" from archive
2007-05-29 02:20:39 UTC LOG: archive recovery complete
2007-05-29 02:20:40 UTC LOG: database system is ready

SELECT pg_switch_xlog(); (It doesn't make any difference if I do this
command or not)

2007-05-29 02:22:04 UTC LOG: archived transaction log file
"000000010000000000000013"

SELECT pg_start_backup('testbackup');

gzip: stdout: Broken pipe
2007-05-29 02:22:46 UTC LOG: archive command
"/usr/local/sbin/xlogarchive pg_xlog/000000010000000000000012
000000010000000000000012" failed: return code 256

gzip: stdout: Broken pipe
2007-05-29 02:22:48 UTC LOG: archive command
"/usr/local/sbin/xlogarchive pg_xlog/000000010000000000000012
000000010000000000000012" failed: return code 256

gzip: stdout: Broken pipe
2007-05-29 02:22:50 UTC LOG: archive command
"/usr/local/sbin/xlogarchive pg_xlog/000000010000000000000012
000000010000000000000012" failed: return code 256
2007-05-29 02:22:50 UTC WARNING: transaction log file
"000000010000000000000012" could not be archived: too many failures
--------

Thanks for any advice.

--
Jon


From: Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Attempt to re-archive existing WAL logs after restoring from backup
Date: 2007-05-31 04:58:26
Message-ID: 465E55F2.2020103@vcxz.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

Jon Colverson wrote:
> I've been testing my DB backup and restore procedure and I've run into
> something I can't figure out. After recovering from a PITR backup, when
> I do another pg_start_backup PostgreSQL attempts to re-archive the last
> log again, which fails because it already exists in the archive.

Hello again. I've been investigating this some more and I've got some
more details to add:

In xlog.c, exitArchiveRecovery() moves the last restored log file into
XLOGDIR.

Later (when shutting down the cluster, for instance),
XLogArchiveCheckDone() is called on the log file. This function looks
for a ".done" or ".ready" file and finds neither, so it calls
XLogArchiveNotify() to create the ".ready" file. This causes the
archiver to come along later and attempt to archive the log and this
fails because the log already exists in the archive (that's where it
came from).

So, I think the problem is that exitArchiveRecovery() should be marking
the restored log file as ".done" (or perhaps not bothering to restore
the log file at all?).

This looks like a bug to me. Any thoughts?

--
Jon


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Attempt to re-archive existing WAL logs after restoring from backup
Date: 2007-05-31 14:01:53
Message-ID: 28998.1180620113@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk> writes:
> So, I think the problem is that exitArchiveRecovery() should be marking
> the restored log file as ".done" (or perhaps not bothering to restore
> the log file at all?).

> This looks like a bug to me. Any thoughts?

I don't think that will fix it. The problem is that after recovery
we try to use the rest of the last WAL segment for fresh WAL data, and
therefore need to modify the segment, and therefore need to (re) archive
it. We can't fix that by doing a forced switch to a new segment, because
that still requires writing a switch record at the right place in the
last segment. There's really no way to make this work without
abandoning the principle that the archive process should never overwrite
files. (Which is certainly not a logically *necessary* thing, but it
seems like a mighty good idea from a safety perspective.)

I believe that this situation doesn't come up in ordinary scenarios
because a WAL segment doesn't get archived until it's really filled.
I take it you are using a backup procedure that undertakes to force the
last partial segment into the archive after a postmaster shutdown. I'd
recommend not doing it that way --- instead, if you want a deterministic
time that the backup is good up to, don't stop the postmaster at all,
just apply pg_switch_xlog() and let the regular archiving process
archive the thereby-completed segment file.

regards, tom lane


From: Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Attempt to re-archive existing WAL logs after restoring from backup
Date: 2007-05-31 15:09:53
Message-ID: 465EE541.3020501@vcxz.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

Thanks for your reply.

Tom Lane wrote:
> Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk> writes:
>> So, I think the problem is that exitArchiveRecovery() should be marking
>> the restored log file as ".done" (or perhaps not bothering to restore
>> the log file at all?).
>
>> This looks like a bug to me. Any thoughts?
>
> I don't think that will fix it. The problem is that after recovery
> we try to use the rest of the last WAL segment for fresh WAL data, and
> therefore need to modify the segment, and therefore need to (re) archive
> it. We can't fix that by doing a forced switch to a new segment, because
> that still requires writing a switch record at the right place in the
> last segment. There's really no way to make this work without
> abandoning the principle that the archive process should never overwrite
> files. (Which is certainly not a logically *necessary* thing, but it
> seems like a mighty good idea from a safety perspective.)

I'm not sure this is quite right. Any log in the archive is, by
definition, a complete log that will have the switch record already
(because it will only have been marked for archiving after being
completed). Hence I don't really understand why the last one is being
copied back to pg_xlog after being used in the recovery. The long
comment in exitArchiveRecovery() explains that this is to replace any
existing version of the file in pg_xlog, but I think it would be just as
valid to simply remove any existing version since it is definitely out
of date (iff restoredFromArchive).

> I believe that this situation doesn't come up in ordinary scenarios
> because a WAL segment doesn't get archived until it's really filled.
> I take it you are using a backup procedure that undertakes to force the
> last partial segment into the archive after a postmaster shutdown. I'd

No; nothing is touching the archive except for the archive_command being
run by PostgreSQL.

--
Jon


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Jon Colverson" <pgsql(at)vcxz(dot)co(dot)uk>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Attempt to re-archive existing WAL logs afterrestoring from backup
Date: 2007-06-01 11:02:34
Message-ID: 1180695755.26297.101.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

On Thu, 2007-05-31 at 16:09 +0100, Jon Colverson wrote:
> >
> >> This looks like a bug to me. Any thoughts?
> >

> No; nothing is touching the archive except for the archive_command being
> run by PostgreSQL.

Jon,

You're still troubled by what you've seen?

Can you start from the beginning again and describe exactly what set of
circumstances you're in. I want to remove any bug or doubt that there is
one lurking.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Jon Colverson" <pgsql(at)vcxz(dot)co(dot)uk>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Attempt to re-archive existing WAL logs afterrestoringfrom backup
Date: 2007-06-01 20:02:19
Message-ID: 1180728140.26297.261.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

On Fri, 2007-06-01 at 12:02 +0100, Simon Riggs wrote:
> On Thu, 2007-05-31 at 16:09 +0100, Jon Colverson wrote:
> > >
> > >> This looks like a bug to me. Any thoughts?
> > >
>
> > No; nothing is touching the archive except for the archive_command being
> > run by PostgreSQL.
>
> Jon,
>
> You're still troubled by what you've seen?
>
> Can you start from the beginning again and describe exactly what set of
> circumstances you're in. I want to remove any bug or doubt that there is
> one lurking.

I've spent some time re-examining the code and I can't see a route for
the discussed problem to occur. All the main code paths do not exhibit
this problem in 8.2, so I'm not sure where to go from here.

If you can help further, please give me a shout.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: Jon Colverson <pgsql(at)vcxz(dot)co(dot)uk>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Attempt to re-archive existing WAL logs afterrestoringfrom backup
Date: 2007-06-02 00:07:44
Message-ID: 4660B4D0.4090102@vcxz.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

Simon Riggs wrote:
> I've spent some time re-examining the code and I can't see a route for
> the discussed problem to occur. All the main code paths do not exhibit
> this problem in 8.2, so I'm not sure where to go from here.
>
> If you can help further, please give me a shout.

Thanks a lot for looking into this.

I've attached a self-contained minimal shell script which sets up a
cluster and demonstrates the problem. I hope that will provide an
unambiguous description. I'm running it on Ubuntu 7.04 (i386) and the
script includes all the commands to build PostgreSQL from scratch, but
obviously you can snip that stuff out and change the paths to suit your
environment.

The script ends with the server log showing the archiver attempting to
re-archive the last log that was restored in the recovery procedure.

Thanks again.

--
Jon

Attachment Content-Type Size
pgsql-duplicate-xlog-demo.sh text/plain 2.5 KB

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Jon Colverson" <pgsql(at)vcxz(dot)co(dot)uk>
Cc: <pgsql-admin(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Attempt to re-archive existing WAL logsafterrestoringfrom backup
Date: 2007-06-04 07:49:47
Message-ID: 1180943388.2870.40.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

On Sat, 2007-06-02 at 01:07 +0100, Jon Colverson wrote:
> Simon Riggs wrote:
> > I've spent some time re-examining the code and I can't see a route for
> > the discussed problem to occur. All the main code paths do not exhibit
> > this problem in 8.2, so I'm not sure where to go from here.
> >
> > If you can help further, please give me a shout.
>
> Thanks a lot for looking into this.
>
> I've attached a self-contained minimal shell script which sets up a
> cluster and demonstrates the problem.

OK, thanks. I'll run with it from here.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Jon Colverson" <pgsql(at)vcxz(dot)co(dot)uk>
Cc: <pgsql-admin(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Attempt to re-archive existing WAL logsafterrestoringfrom backup
Date: 2007-06-07 21:20:32
Message-ID: 1181251233.26886.296.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

On Sat, 2007-06-02 at 01:07 +0100, Jon Colverson wrote:
> Simon Riggs wrote:
> > I've spent some time re-examining the code and I can't see a route for
> > the discussed problem to occur. All the main code paths do not exhibit
> > this problem in 8.2, so I'm not sure where to go from here.
> >
> > If you can help further, please give me a shout.
>
> Thanks a lot for looking into this.
>
> I've attached a self-contained minimal shell script which sets up a
> cluster and demonstrates the problem. I hope that will provide an
> unambiguous description. I'm running it on Ubuntu 7.04 (i386) and the
> script includes all the commands to build PostgreSQL from scratch, but
> obviously you can snip that stuff out and change the paths to suit your
> environment.
>
> The script ends with the server log showing the archiver attempting to
> re-archive the last log that was restored in the recovery procedure.

Jon,

Your script is correct and there is a problem, as you demonstrate. Thank
you for bringing this to the list.

You have followed the manual in full. In Step 6 of the recovery
procedure outlined here
http://developer.postgresql.org/pgdocs/postgres/continuous-archiving.html#BACKUP-PITR-RECOVERY
we say that if you have partially completed WAL logs these should be
copied back into pg_xlog. This is correct and if we do this we avoid the
error you have highlighted. In the case of a true disaster recovery, as
you have attempted to test, no files are available and so an attempt is
made to archive the last restored file. That attempt fails iff you have
done as the manual suggests and used an archive_command that prevents
duplicate file insertions.

I'll repost to pgsql-hackers to discuss solutions. Fixes are simple, but
require some discussion.

Initially I thought you'd fallen foul of another problem which is known
to exist, which is caused by immediately shutting down a server after
having taken a hot backup. This isn't a problem in real usage though is
a problem in a scripted test. I've already proposed solutions there and
will pick up that thread again also.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Jon Colverson" <pgsql(at)vcxz(dot)co(dot)uk>
Cc: <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [ADMIN] Attempt to re-archive existing WAL logsafterrestoringfrom backup
Date: 2007-06-07 21:23:06
Message-ID: 1181251387.26886.300.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin pgsql-hackers

On Thu, 2007-06-07 at 22:20 +0100, Simon Riggs wrote:
> On Sat, 2007-06-02 at 01:07 +0100, Jon Colverson wrote:
> > Simon Riggs wrote:
> > > I've spent some time re-examining the code and I can't see a route for
> > > the discussed problem to occur. All the main code paths do not exhibit
> > > this problem in 8.2, so I'm not sure where to go from here.
> > >
> > > If you can help further, please give me a shout.
> >
> > Thanks a lot for looking into this.
> >
> > I've attached a self-contained minimal shell script which sets up a
> > cluster and demonstrates the problem. I hope that will provide an
> > unambiguous description. I'm running it on Ubuntu 7.04 (i386) and the
> > script includes all the commands to build PostgreSQL from scratch, but
> > obviously you can snip that stuff out and change the paths to suit your
> > environment.
> >
> > The script ends with the server log showing the archiver attempting to
> > re-archive the last log that was restored in the recovery procedure.
>
> Jon,
>
> Your script is correct and there is a problem, as you demonstrate. Thank
> you for bringing this to the list.
>
> You have followed the manual in full. In Step 6 of the recovery
> procedure outlined here
> http://developer.postgresql.org/pgdocs/postgres/continuous-archiving.html#BACKUP-PITR-RECOVERY
> we say that if you have partially completed WAL logs these should be
> copied back into pg_xlog. This is correct and if we do this we avoid the
> error you have highlighted. In the case of a true disaster recovery, as
> you have attempted to test, no files are available and so an attempt is
> made to archive the last restored file. That attempt fails iff you have
> done as the manual suggests and used an archive_command that prevents
> duplicate file insertions.
>
> I'll repost to pgsql-hackers to discuss solutions. Fixes are simple, but
> require some discussion.

Currently if we perform an archive recovery and the end of logs occur
during or at the end of an xlog that has already been archived then we
will fail and keep failing to archive anything. Thats not much fun and
it appears to me that this problem goes back to early days of PITR and
may require backpatching.

Currently if an xlog file has a missing archive_status file then we
create a .ready status, which leads to the command being retried, which
might fail or might not. Since the last file on an archive recovery
seldom has an archive status this leads to failure situations when the
archive_command prevents duplicate files from existing in the archive.

AFAICS we have four options:

1. We can assume that if an xlog file has no archive status it must have
been archived. Hmmm, sounds like a poor assumption.

2. We can relax the recommendation to have an archive_command that
prevents duplicate logs. Although I haven't in recent times seen this as
too strong a requirement because we have timelines, I don't think we
should relax this.

3. We craft a special archive_status file for the last xlog file in an
archive, so that it never gets archived.

4. As Warren Little recently regretted, if we have a corrupt xlog file
then end of logs happens much earlier than anticipated. When this occurs
many subsequent xlogs would be named similarly to files already in the
archive, yet they would be a separate timeline. We could solve this
problem as well by simply bumping the TLI each time we complete a
recovery, whatever the reason. Presumably the archives will not stretch
back as far as 2 billion recoveries.

Option 3 is the straightforward and fixes the issue directly; option 4
solves many problems in a bold sweep of the design pen, with no obvious
downside, AFAICS. Both are simple and can be backpatched easily.

> Initially I thought you'd fallen foul of another problem which is known
> to exist, which is caused by immediately shutting down a server after
> having taken a hot backup. This isn't a problem in real usage though is
> a problem in a scripted test. I've already proposed solutions there and
> will pick up that thread again also.

On another thread on -hackers.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com