Re: [GENERAL] PITR and tar

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: PITR and tar
Date: 2007-05-07 18:58:06
Message-ID: 1178564286.23358.7.camel@dogma.v10.wvs
Lists: pgsql-docs pgsql-general

The docs recommend using tar to perform a base backup for PITR.

Usually, tar reports notices like:
"tar: Truncated write; file may have grown while being archived."

First of all, is the tar archive still safe if those errors occur?

Second, passing the "z" option to tar seems to produce a bad backup when
that happens. Piping the output of tar through the compression program
instead seems to avoid the problem (i.e. "tar cf - ... | gzip > ...").
I am using FreeBSD's tar; other implementations may behave differently.

Are my observations correct, and if so, should they be documented as a
potential "gotcha" when making base backups?

Regards,
Jeff Davis


From: "Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at>
To: "Jeff Davis *EXTERN*" <pgsql(at)j-davis(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: PITR and tar
Date: 2007-05-08 06:47:36
Message-ID: AFCCBB403D7E7A4581E48F20AF3E5DB20291A531@EXADV1.host.magwien.gv.at

> The docs recommend using tar to perform a base backup for PITR.
>
> Usually, tar reports notices like:
> "tar: Truncated write; file may have grown while being archived."

Did you call pg_start_backup(text) before you started to archive?

Yours,
Laurenz Albe


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-08 15:25:58
Message-ID: 8C4728BC-E2DE-4E99-813A-3A15C3F97A80@decibel.org

On May 7, 2007, at 1:58 PM, Jeff Davis wrote:
> Second, it seems that it can cause a bad backup to occur if you
> pass the
> "z" option to tar. Instead, piping the output of tar through the
> compression program seems to avoid that problem (i.e. "tar cf - ... |
> gzip > ..."). I am using FreeBSD's tar, other implementations may be
> different.

What *exactly* are you seeing there? If anything -z should be safer
than piping through gzip, since you could easily accidentally pipe
stderr through gzip as well, which *would* corrupt the backup.
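
A minimal sketch of that pitfall, not anyone's actual backup script:
"fake_tar" is a hypothetical stand-in for tar that writes archive bytes
to stdout and a warning to stderr. With "2>&1 |" the warning text ends
up inside the compressed stream; sending stderr to a separate log keeps
the stream clean.

```shell
# Hypothetical stand-in for tar: archive bytes on stdout, a warning on stderr.
fake_tar() {
    echo "ARCHIVE-DATA"
    echo "tar: warning text" >&2
}

workdir=$(mktemp -d)

# Risky: 2>&1 sends stderr into the pipe, so the warning is compressed too.
fake_tar 2>&1 | gzip > "$workdir/mixed.gz"

# Safer: only stdout enters the pipe; stderr goes to its own log file.
fake_tar 2> "$workdir/warnings.log" | gzip > "$workdir/clean.gz"

mixed=$(gzip -dc "$workdir/mixed.gz")
clean=$(gzip -dc "$workdir/clean.gz")
echo "mixed stream: $mixed"
echo "clean stream: $clean"
```

The clean stream holds only the archive bytes, and the warning is still
available in warnings.log for the cron mail.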

> Are my observations correct, and if so, should they be documented as a
> potential "gotcha" when making base backups?

I believe the bit about tar complaining about changed files is
already in there, no?
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-08 17:14:24
Message-ID: 1178644464.24902.16.camel@dogma.v10.wvs

On Tue, 2007-05-08 at 10:25 -0500, Jim Nasby wrote:
> On May 7, 2007, at 1:58 PM, Jeff Davis wrote:
> > Second, it seems that it can cause a bad backup to occur if you
> > pass the
> > "z" option to tar. Instead, piping the output of tar through the
> > compression program seems to avoid that problem (i.e. "tar cf - ... |
> > gzip > ..."). I am using FreeBSD's tar, other implementations may be
> > different.
>
> What *exactly* are you seeing there? If anything -z should be safer
> than piping through gzip, since you could easily accidentally pipe
> stderr through gzip as well, which *would* corrupt the backup.
>

tar: Truncated write; file may have grown while being archived.
tar: Truncated write; file may have grown while being archived.
tar: GZip compression failed

That is the output from my cron script (which is emailed to me). It
happened several times in a row. When I tried to extract one of those
backups, I got errors like these (some names have been changed):

$ tar zxf mybackup.tar.gz
data/base/16418/32309.1: Premature end of gzip compressed data: Input/output error
tar: Premature end of gzip compressed data: Input/output error

and
$ gzip -dc mybackup.tar.gz > /dev/null
gzip: ../mybackup.tar.gz: unexpected end of file
gzip: ../mybackup.tar.gz: uncompress failed
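
A check like the one above can be run right after each backup rather
than at restore time. The following is only a sketch with made-up file
names: it builds a small archive, cuts a copy short to mimic a
compressor that stopped partway through, and uses "gzip -t" to tell the
two apart.

```shell
workdir=$(mktemp -d)

# Build a small sample file (hypothetical stand-in for a relation file).
i=0
while [ "$i" -lt 2000 ]; do
    echo "row $i of sample data"
    i=$((i + 1))
done > "$workdir/payload"

# A complete archive, then a copy cut off after 100 bytes to mimic a
# compressor that stopped partway through.
tar cf - -C "$workdir" payload | gzip > "$workdir/good.tar.gz"
dd if="$workdir/good.tar.gz" of="$workdir/cut.tar.gz" bs=100 count=1 2>/dev/null

# gzip -t decompresses the whole stream without writing it anywhere and
# exits nonzero if the stream ends early or fails its checksum.
check_archive() {
    if gzip -t "$1" 2>/dev/null; then
        echo "intact"
    else
        echo "damaged"
    fi
}

good_result=$(check_archive "$workdir/good.tar.gz")
cut_result=$(check_archive "$workdir/cut.tar.gz")
echo "good.tar.gz: $good_result"
echo "cut.tar.gz: $cut_result"
```

This only proves that the gzip stream is complete; it says nothing
about whether the files inside form a usable base backup.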

This may be specific to FreeBSD's tar. I remember testing in the past on
Linux and never had these problems.

When I switched to a pipe instead of the "z" flag, it worked fine. I
still get stderr properly (it is also emailed to me via cron), but it
contains only the "truncated write" warnings.
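
For illustration, the pipe variant might look roughly like the sketch
below. The directory layout and file names are hypothetical stand-ins,
and a real script would run this between pg_start_backup() and
pg_stop_backup() against the actual data directory.

```shell
# All paths are hypothetical stand-ins for a real data directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/data/base/16418"
echo "sample relation contents" > "$workdir/data/base/16418/sample"

backup="$workdir/mybackup.tar.gz"
errlog="$workdir/tar.err"

# tar writes the archive to stdout ("f -") and gzip compresses it as a
# separate process. tar's warnings go to $errlog, not into the pipe.
tar cf - -C "$workdir" data 2> "$errlog" | gzip > "$backup"

# Read the archive back end to end by listing its members.
members=$(gzip -dc "$backup" | tar tf -)
echo "$members"
```

Note that in plain sh the exit status of the pipeline is gzip's, not
tar's, so anything tar complained about shows up only in the error log.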

> > Are my observations correct, and if so, should they be documented as a
> > potential "gotcha" when making base backups?
>
> I believe the bit about tar complaining about changed files is
> already in there, no?

I was talking about using the "z" flag with tar causing potential bad
backups as described above, not just the warnings. If that's true, there
are probably other people with untrustworthy backups.

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-08 17:16:50
Message-ID: 1178644610.24902.20.camel@dogma.v10.wvs

On Tue, 2007-05-08 at 08:47 +0200, Albe Laurenz wrote:
> > The docs recommend using tar to perform a base backup for PITR.
> >
> > Usually, tar reports notices like:
> > "tar: Truncated write; file may have grown while being archived."
>
> Did you call pg_start_backup(text) before you started to archive?
>

I was referring to the result of the tar itself being a corrupted gzip
file (that couldn't be uncompressed with gunzip).

I did indeed call pg_start/stop_backup().

Regards,
Jeff Davis


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: "Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at>, pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-08 17:24:28
Message-ID: b42b73150705081024v5151d47am620423865ed7f524@mail.gmail.com

On 5/8/07, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Tue, 2007-05-08 at 08:47 +0200, Albe Laurenz wrote:
> > > The docs recommend using tar to perform a base backup for PITR.
> > >
> > > Usually, tar reports notices like:
> > > "tar: Truncated write; file may have grown while being archived."
> >
> > Did you call pg_start_backup(text) before you started to archive?
> >
>
> I was referring to the result of the tar itself being a corrupted gzip
> file (that couldn't be uncompressed with gunzip).
>
> I did indeed call pg_start/stop_backup().

is fsync on?

merlin


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-08 17:28:50
Message-ID: 1178645330.24902.22.camel@dogma.v10.wvs

On Tue, 2007-05-08 at 13:24 -0400, Merlin Moncure wrote:
> On 5/8/07, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> > On Tue, 2007-05-08 at 08:47 +0200, Albe Laurenz wrote:
> > > > The docs recommend using tar to perform a base backup for PITR.
> > > >
> > > > Usually, tar reports notices like:
> > > > "tar: Truncated write; file may have grown while being archived."
> > >
> > > Did you call pg_start_backup(text) before you started to archive?
> > >
> >
> > I was referring to the result of the tar itself being a corrupted gzip
> > file (that couldn't be uncompressed with gunzip).
> >
> > I did indeed call pg_start/stop_backup().
>
> is fsync on?
>

Yes. I have a battery-backed cache as well, and there were no power
failures involved.

Regards,
Jeff Davis


From: "Dhaval Shah" <dhaval(dot)shah(dot)m(at)gmail(dot)com>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-09 15:45:21
Message-ID: 565237760705090845j765e91bet522bb07cfb8995aa@mail.gmail.com

Looks like a problem specific to FreeBSD. I use CentOS with PostgreSQL
8.2.3 and I do not see that problem at all.

Dhaval

On 5/8/07, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Tue, 2007-05-08 at 13:24 -0400, Merlin Moncure wrote:
> > On 5/8/07, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> > > On Tue, 2007-05-08 at 08:47 +0200, Albe Laurenz wrote:
> > > > > The docs recommend using tar to perform a base backup for PITR.
> > > > >
> > > > > Usually, tar reports notices like:
> > > > > "tar: Truncated write; file may have grown while being archived."
> > > >
> > > > Did you call pg_start_backup(text) before you started to archive?
> > > >
> > >
> > > I was referring to the result of the tar itself being a corrupted gzip
> > > file (that couldn't be uncompressed with gunzip).
> > >
> > > I did indeed call pg_start/stop_backup().
> >
> > is fsync on?
> >
>
> Yes. I have a battery-backed cache as well, and there were no power
> failures involved.
>
> Regards,
> Jeff Davis

--
Dhaval Shah


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Dhaval Shah <dhaval(dot)shah(dot)m(at)gmail(dot)com>
Cc: "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-09 16:40:57
Message-ID: B244343D-69D6-4700-8D10-D95E9FCBF4FA@decibel.org

Actually, looking at the docs, the problem is with some versions of
GNU tar. AFAIK bsdtar is perfectly happy to archive files that have
changed from underneath it.

On May 9, 2007, at 10:45 AM, Dhaval Shah wrote:

> Looks like a problem specific to FreeBSD. I use Centos/postgres 8.2.3
> and I do not see that problem at all.
>
> Dhaval
>
> On 5/8/07, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> On Tue, 2007-05-08 at 13:24 -0400, Merlin Moncure wrote:
>> > On 5/8/07, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> > > On Tue, 2007-05-08 at 08:47 +0200, Albe Laurenz wrote:
>> > > > > The docs recommend using tar to perform a base backup for
>> PITR.
>> > > > >
>> > > > > Usually, tar reports notices like:
>> > > > > "tar: Truncated write; file may have grown while being
>> archived."
>> > > >
>> > > > Did you call pg_start_backup(text) before you started to
>> archive?
>> > > >
>> > >
>> > > I was referring to the result of the tar itself being a
>> corrupted gzip
>> > > file (that couldn't be uncompressed with gunzip).
>> > >
>> > > I did indeed call pg_start/stop_backup().
>> >
>> > is fsync on?
>> >
>>
>> Yes. I have a battery-backed cache as well, and there were no power
>> failures involved.
>>
>> Regards,
>> Jeff Davis
>
>
> --
> Dhaval Shah

--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: Dhaval Shah <dhaval(dot)shah(dot)m(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: PITR and tar
Date: 2007-05-09 17:19:05
Message-ID: 1178731145.24902.70.camel@dogma.v10.wvs

On Wed, 2007-05-09 at 11:40 -0500, Jim Nasby wrote:
> Actually, looking at the docs, the problem is with some versions of
> GNU tar. AFAIK bsdtar is perfectly happy to archive files that have
> changed from underneath it.
>

$ tar --version
bsdtar 1.2.53 - libarchive 1.3.1

That fails to create a file in proper gzip format when the files are
concurrently modified.

However,

$ tar --version
tar (GNU tar) 1.14
Copyright (C) 2004 Free Software Foundation, Inc.
This program comes with NO WARRANTY, to the extent permitted by law.
You may redistribute it under the terms of the GNU General Public License;
see the file named COPYING for details.
Written by John Gilmore and Jay Fenlason.

That _appears_ to work.

Perhaps FreeBSD users should take notice of this problem. It's certainly
not a PostgreSQL problem, but I know there are a lot of FreeBSD users
here, and using tar on fast-changing data may be rare outside of
PostgreSQL.

Regards,
Jeff Davis


From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Dhaval Shah <dhaval(dot)shah(dot)m(at)gmail(dot)com>, pgsql-docs(at)postgresql(dot)org
Subject: Re: [GENERAL] PITR and tar
Date: 2007-05-13 22:44:09
Message-ID: 20070513224408.GC69517@nasby.net

Moving to -docs...

Does anyone know the history behind the docs saying that GNU tar had
issues with files changing underneath it? According to this report, it's
actually BSD tar that has the issue.

On Wed, May 09, 2007 at 10:19:05AM -0700, Jeff Davis wrote:
> On Wed, 2007-05-09 at 11:40 -0500, Jim Nasby wrote:
> > Actually, looking at the docs, the problem is with some versions of
> > GNU tar. AFAIK bsdtar is perfectly happy to archive files that have
> > changed from underneath it.
> >
>
> $ tar --version
> bsdtar 1.2.53 - libarchive 1.3.1
>
> That fails to create a file in proper gzip format when the files are
> concurrently modified.
>
> However,
>
> $ tar --version
> tar (GNU tar) 1.14
> Copyright (C) 2004 Free Software Foundation, Inc.
> This program comes with NO WARRANTY, to the extent permitted by law.
> You may redistribute it under the terms of the GNU General Public
> License;
> see the file named COPYING for details.
> Written by John Gilmore and Jay Fenlason.
>
> That _appears_ to work.
>
> Perhaps FreeBSD users should take notice of this problem. It's certainly
> not a postgresql problem, but I know there are a lot of freebsd users
> here, and using tar on fast-changing data may be rare outside of
> postgresql.
>
> Regards,
> Jeff Davis
>

--
Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Dhaval Shah <dhaval(dot)shah(dot)m(at)gmail(dot)com>, pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [GENERAL] PITR and tar
Date: 2007-05-14 01:04:53
Message-ID: 200705140104.l4E14rn21632@momjian.us

Jim C. Nasby wrote:
> Moving to -docs...
>
> Does anyone know what the history of the docs saying that GNU tar had
> issues with files changing underneath it? According to this report it's
> actually BSD tar that has the issue.

As I remember, Tom was the one who found that GNU tar would return a
non-zero exit status if the file changed during backup, so you couldn't
determine if the backup was successful based on the exit code.

---------------------------------------------------------------------------

>
> On Wed, May 09, 2007 at 10:19:05AM -0700, Jeff Davis wrote:
> > On Wed, 2007-05-09 at 11:40 -0500, Jim Nasby wrote:
> > > Actually, looking at the docs, the problem is with some versions of
> > > GNU tar. AFAIK bsdtar is perfectly happy to archive files that have
> > > changed from underneath it.
> > >
> >
> > $ tar --version
> > bsdtar 1.2.53 - libarchive 1.3.1
> >
> > That fails to create a file in proper gzip format when the files are
> > concurrently modified.
> >
> > However,
> >
> > $ tar --version
> > tar (GNU tar) 1.14
> > Copyright (C) 2004 Free Software Foundation, Inc.
> > This program comes with NO WARRANTY, to the extent permitted by law.
> > You may redistribute it under the terms of the GNU General Public
> > License;
> > see the file named COPYING for details.
> > Written by John Gilmore and Jay Fenlason.
> >
> > That _appears_ to work.
> >
> > Perhaps FreeBSD users should take notice of this problem. It's certainly
> > not a postgresql problem, but I know there are a lot of freebsd users
> > here, and using tar on fast-changing data may be rare outside of
> > postgresql.
> >
> > Regards,
> > Jeff Davis
> >
>
> --
> Jim Nasby decibel(at)decibel(dot)org
> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Dhaval Shah <dhaval(dot)shah(dot)m(at)gmail(dot)com>, pgsql-docs(at)postgresql(dot)org
Subject: Re: [GENERAL] PITR and tar
Date: 2007-05-14 01:46:21
Message-ID: 23024.1179107181@sss.pgh.pa.us

"Jim C. Nasby" <decibel(at)decibel(dot)org> writes:
> Does anyone know what the history of the docs saying that GNU tar had
> issues with files changing underneath it? According to this report it's
> actually BSD tar that has the issue.

It seems to be a different issue. The problem with GNU tar is that it
issues a warning and exits with nonzero status, which is a problem for
backup scripts because they can't easily distinguish this case from an
actual failure. But AFAIK the output file is self-consistent anyway.
It sounds like bsd tar is just plain broken :-(

regards, tom lane
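
One way a backup script could cope with the exit-status problem is
sketched below. It rests on an assumption that does not hold for every
tar: that the installed tar is GNU tar 1.16 or later, which exits with 1
when a file changed while being archived and with 2 on a fatal error.
Older versions, such as the 1.14 mentioned in this thread, may not make
that distinction. The function name "classify_tar_status" is made up for
this example.

```shell
# Map a tar exit status to a verdict.
# ASSUMPTION: GNU tar 1.16+ semantics (0 = success, 1 = some files
# changed while being read, anything else = fatal error).
classify_tar_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "ok-with-warnings" ;;
        *) echo "failed" ;;
    esac
}

# Hypothetical sample data so the sketch runs on its own.
workdir=$(mktemp -d)
echo "sample" > "$workdir/file"

tar cf "$workdir/backup.tar" -C "$workdir" file
status=$?
verdict=$(classify_tar_status "$status")
echo "tar exited with $status: $verdict"
```

A script would treat "ok-with-warnings" as acceptable for a base backup
taken between pg_start_backup() and pg_stop_backup(), and abort on
"failed".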