Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog?
Date: 2011-04-21 14:05:43
Message-ID: BANLkTikKjTNwx+0uGHMcjDFVPMdKHxGgPA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 21, 2011 at 6:15 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> To start at the end of this story: "DETAIL:  Could not read from file
> "pg_clog/007D" at offset 65536: Success."
>
> This is a message we received on a standby that we were bringing
> online as part of a test.  The clog file was present, but apparently
> too small for Postgres (or at least I think this is what the message
> meant), so one could stub in another clog file and then continue
> recovery successfully (modulo the voodoo of stubbing in clog files in
> general).  I am unsure if this is due to an interesting race condition
> in Postgres or a result of my somewhat-interesting hot-backup
> protocol, which is slightly more involved than the norm.  I will
> describe what it does here:
>
> 1) Call pg start backup
> 2) crawl the entire postgres cluster directory structure, except
> pg_xlog, taking note of the size of every file present
> 3) begin writing TAR files, but *only up to the size noted during the
> original crawling of the cluster directory,* so if a file grows
> between the original snapshot and the subsequent read() of that file,
> the extra bytes will not be added to the TAR.
>  3a) If a file has instead been partially truncated, I add "\0" bytes
> to pad the tarfile member up to the size sampled in step 2, since I am
> streaming the tar file and cannot go back in the stream to adjust the
> tarfile member's size
> 4) call pg stop backup
>
> The reason I go to this trouble is that I use many completely
> disjoint tar files to do parallel compression, decompression,
> uploading, and downloading of the base backup of the database, and I
> want to be able to control the size of these files up-front.  The
> "\0" stubbing is required because of a limitation of the tar format
> when dealing with streaming archives, and truncating the files to the
> size snapshotted in step 2 is what allows the backup to be split
> between volumes even in the presence of possible concurrent growth
> while I'm performing the hot backup (ex: a handful of nearly-empty
> heap files can rapidly grow due to a concurrent bulk load if I get
> unlucky, a situation I do not intend to leave myself exposed to).
>
> Any ideas?  Or does it sound like I'm making some bookkeeping errors
> and should review my code again?  It does work most of the time; I
> have not yet gotten a sense of how often this reproduces.
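
For illustration, a minimal sketch (in Python, and emphatically not
Daniel's actual code) of the size-capped streaming-tar idea described
above: it handles a single output tar, omits the pg_start_backup() /
pg_stop_backup() calls of steps 1 and 4 as well as the splitting into
many disjoint archives, and all helper names are made up.

import os
import tarfile

class PaddedReader:
    # File-like wrapper that yields exactly `size` bytes from `path`:
    # bytes past the snapshotted size are ignored, and a file that has
    # shrunk in the meantime is padded out with "\0" bytes.
    def __init__(self, path, size):
        self.f = open(path, "rb")
        self.remaining = size

    def read(self, n=-1):
        if n < 0 or n > self.remaining:
            n = self.remaining
        data = self.f.read(n)
        data += b"\0" * (n - len(data))  # pad if the file was truncated
        self.remaining -= n
        return data

def snapshot_sizes(cluster_dir):
    # Step 2: note the size of every file, skipping pg_xlog.
    sizes = {}
    for root, dirs, files in os.walk(cluster_dir):
        dirs[:] = [d for d in dirs if d != "pg_xlog"]
        for name in files:
            path = os.path.join(root, name)
            sizes[path] = os.path.getsize(path)
    return sizes

def write_backup(cluster_dir, out_path):
    # Step 3: stream a tar whose member sizes are fixed at the values
    # sampled in step 2, regardless of concurrent growth or truncation.
    sizes = snapshot_sizes(cluster_dir)
    with tarfile.open(out_path, "w|") as tar:  # streaming write mode
        for path, size in sizes.items():
            member = tarfile.TarInfo(os.path.relpath(path, cluster_dir))
            member.size = size
            tar.addfile(member, PaddedReader(path, size))

The key point is that each tar member's length is fixed from the step-2
snapshot, so concurrent growth is ignored and concurrent truncation is
papered over with zero padding; WAL replay from the backup's start
point is what is supposed to make such torn copies safe.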

Everyone here is going to assume the problem is in your (too?) fancy
tar/diff delta archiving approach, because we can't see that code and
it just sounds suspicious. A busted clog file is of course very
noteworthy, but to rule out your own code you should try to reproduce
the problem using a more standard method of grabbing the base backup.

Have you considered using rsync instead?
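
For comparison, the more conventional recipe is only a sketch here as
well: the connection string, paths, and the use of psycopg2 are
assumptions, and it presumes WAL archiving is already configured so the
standby can replay past the backup.

import subprocess
import psycopg2

conn = psycopg2.connect("dbname=postgres user=postgres")
conn.autocommit = True
cur = conn.cursor()

cur.execute("SELECT pg_start_backup('base-backup-test')")
try:
    # Copy the whole cluster except pg_xlog.  rsync exit codes 23/24
    # (files changed or vanished mid-copy) are expected during a hot
    # backup and are not treated as fatal here.
    rc = subprocess.call([
        "rsync", "-a", "--delete", "--exclude=pg_xlog",
        "/var/lib/postgresql/data/",
        "standby:/var/lib/postgresql/data/",
    ])
    if rc not in (0, 23, 24):
        raise RuntimeError("rsync failed with exit code %d" % rc)
finally:
    cur.execute("SELECT pg_stop_backup()")

If a backup taken this way restores and recovers cleanly, the tar
bookkeeping becomes the prime suspect.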

merlin
