PITR and Compressed WALS

From: Brian Wipf <brian(at)clickspace(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: PITR and Compressed WALS
Date: 2007-10-02 20:36:16
Message-ID: C47CC7B5-A4AE-4570-BBB5-600D0D1D4751@clickspace.com
Lists: pgsql-general

We have two PostgreSQL 8.2.4 servers. On one database, WALs are
archived with a simple script that gzips and transfers them to an NFS
file server. The other database is in perpetual recovery mode,
ungzipping and processing the WALs as they appear and become
complete on the file server. This has been running fine for the past
few days. As soon as the gzipped WAL appears in the archived WAL
directory, I see an entry in the logs that the file has been restored.
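
For readers who want something concrete, here is a minimal sketch of what an archiving step like the one described above might look like. It is not the script used on these servers. The function name archive_wal and the temporary ".tmp" suffix are assumptions made for illustration. The segment is compressed under a temporary name and only then renamed, so a reader polling the archive never picks up a half-written .gz file.

```python
import gzip
import os
import shutil


def archive_wal(wal_path, archive_dir):
    """Gzip one WAL segment into archive_dir and return the final path.

    Hypothetical helper for illustration, not the original poster's script.
    """
    final_path = os.path.join(archive_dir, os.path.basename(wal_path) + ".gz")

    # Never overwrite a segment that has already been archived.
    if os.path.exists(final_path):
        raise FileExistsError(final_path)

    # Compress under a temporary name so that the final name only ever
    # refers to a complete file.
    tmp_path = final_path + ".tmp"
    with open(wal_path, "rb") as src, gzip.open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Rename within the same directory once the compressed copy is complete.
    os.replace(tmp_path, final_path)
    return final_path
```

A wrapper invoked from archive_command would pass %p (the path of the segment to archive) as wal_path and must exit with a non-zero status if the function raises.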

Last night, I brought the database out of its perpetual recovery
mode. Here are the lines from the log when this was done:
[2007-10-01 23:43:03 MDT] LOG: restored log file "000000010000046600000060" from archive
[2007-10-01 23:45:50 MDT] LOG: could not open file "pg_xlog/000000010000046600000061" (log file 1126, segment 97): No such file or directory
[2007-10-01 23:45:50 MDT] LOG: redo done at 466/60000070

Which is all fine, since 000000010000046600000060.gz was the last
archived WAL file. The next entry in the log follows:

[2007-10-01 23:45:50 MDT] PANIC: could not open file "pg_xlog/000000010000046600000060" (log file 1126, segment 96): No such file or directory
[2007-10-01 23:45:51 MDT] LOG: startup process (PID 27624) was terminated by signal 6
[2007-10-01 23:45:51 MDT] LOG: aborting startup due to startup process failure
[2007-10-01 23:45:51 MDT] LOG: logger shutting down

And the database would not start up. The issue appears to be that the
restore_command script itself ungzips the WAL to its destination %p,
and the WAL is left in the archive directory as
000000010000046600000060.gz. By simply ungzipping the last few WALs
manually in the archive directory, the database replayed them and
started up successfully.
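
To illustrate the situation, here is a hedged sketch of a restore step that accepts a segment whether it sits in the archive compressed or already uncompressed, which is the state the archive was in after the manual workaround above. It is not the restore_command script that was actually in use, and it is not a confirmed fix for the PANIC. The function name restore_wal and the ".tmp" suffix are assumptions for illustration.

```python
import gzip
import os
import shutil


def restore_wal(archive_dir, wal_name, dest_path):
    """Copy one WAL segment from the archive to dest_path.

    Returns True on success, False if the segment is not in the archive.
    Hypothetical helper for illustration, not the original poster's script.
    """
    plain_path = os.path.join(archive_dir, wal_name)
    gz_path = plain_path + ".gz"
    tmp_path = dest_path + ".tmp"

    if os.path.exists(gz_path):
        # Decompress into a temporary file; the archived .gz is left in place.
        with gzip.open(gz_path, "rb") as src, open(tmp_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    elif os.path.exists(plain_path):
        # Fall back to a segment that was uncompressed by hand.
        shutil.copyfile(plain_path, tmp_path)
    else:
        # Segment not archived: the caller must report failure.
        return False

    # Expose the segment at its destination only once it is complete.
    os.replace(tmp_path, dest_path)
    return True
```

A wrapper invoked from restore_command would pass %f as wal_name and %p as dest_path, and must exit with a non-zero status when the function returns False, since the server will ask for files that are not in the archive.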

I'm not sure if this should be listed as another caveat on the PITR
recovery page, but at the very least I wanted to post to the list so
that others attempting to archive and recover compressed WALs may be
aware of a potential issue.

Brian Wipf
<brian(at)clickspace(dot)com>
