Re: pg_basebackup for streaming base backups

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup for streaming base backups
Date: 2011-01-20 12:51:44
Message-ID: AANLkTimefCcoSfj_PY=Mw85hMYjAXVy0eM+QFdEDLcUE@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 12:42, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Jan 20, 2011 at 05:23, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> It would be helpful to document what must be set to allow a pg_basebackup
>> connection. That is not only the REPLICATION privilege but also
>> max_wal_senders and pg_hba.conf.
>
> Hmm. Yeah, I guess that wouldn't hurt. Will add that.

Added, see github branch.
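
For illustration only, the three pieces mentioned above might look roughly
like the fragment below. The role name and client address are placeholders,
not taken from the patch, and a given server may need other settings as well.

```
# postgresql.conf: allow at least one walsender connection
max_wal_senders = 1

# pg_hba.conf: permit replication connections for the backup role
# (backup_user and 192.0.2.10/32 are hypothetical placeholders)
host    replication    backup_user    192.0.2.10/32    md5

-- SQL: give the connecting role the REPLICATION privilege
CREATE ROLE backup_user WITH REPLICATION LOGIN;
```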

>> + <refsect1>
>> +  <title>Options</title>
>>
>> Can we list the option descriptions in the same order as
>> "pg_basebackup --help" does?
>>
>> It's helpful to document that the target directory must be specified and
>> it must be empty.
>
> Yeah, that's on the list - I just wanted to make any other changes
> first before I did that. Based on (no further) feedback and a few
> extra questions, I'm going to change it per your suggestion to use -D
> <dir> -F <format>, instead of -D/-T, which will change that stuff
> anyway. So I'll reorder them at that time.

Updated on github.

>> +  <para>
>> +   The backup will include all files in the data directory and tablespaces,
>> +   including the configuration files and any additional files placed in the
>> +   directory by third parties. Only regular files and directories are allowed
>> +   in the data directory, no symbolic links or special device files.
>>
>> The latter sentence means that the backup of the database cluster
>> created by initdb -X is not supported? Because the symlink to the
>> actual WAL directory is included in it.
>
> No, it's not. pg_xlog is specifically excluded, and sent as an empty
> directory, so upon restore you will have an empty pg_xlog directory.

Actually, when I verified that statement, I found a bug where we sent
the wrong thing if pg_xlog was a symlink, leading to a corrupt
tarfile! Patch is in the github branch.

>> OTOH, I found the following source code comments:
>>
>> + * Receive a tar format stream from the connection to the server, and unpack
>> + * the contents of it into a directory. Only files, directories and
>> + * symlinks are supported, no other kinds of special files.
>>
>> This says that symlinks are supported. Which is true? Is the symlink
>> supported only in tar format?
>
> That's actually a *backend* side restriction. If there is a symlink
> anywhere other than pg_tblspc in the data directory, we simply won't
> send it across (with a warning).
>
> The frontend code supports creating symlinks, both in directory format
> and in tar format (actually, in tar format it doesn't do anything, of
> course; it just lets the symlink through).
>
> It wouldn't actually be hard to allow the inclusion of symlinks on the
> backend side. But it would make verification a lot harder - for
> example, if someone symlinked out pg_clog, we'd back up the symlink
> but not the actual files, since they're not registered as a
> tablespace.
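
To make the rules above concrete, here is a rough sketch in Python. It is
not the backend code (which is C) and not taken from the patch: plan_backup
and the (path, kind) tuples are made up for illustration, and skipping
special files with a warning is an assumption. It only decides what would
be sent: pg_xlog always goes out as an empty directory, symlinks directly
under pg_tblspc are kept, and any other symlink is skipped with a warning.

```python
import os


def plan_backup(data_dir):
    """Return (entries, warnings) for a hypothetical backup of data_dir.

    entries is a list of (relative_path, kind) tuples, where kind is
    "file", "dir" or "symlink". Nothing is read or copied here.
    """
    entries, warnings = [], []

    def visit(rel_dir):
        abs_dir = os.path.join(data_dir, rel_dir)
        for name in sorted(os.listdir(abs_dir)):
            rel = os.path.join(rel_dir, name) if rel_dir else name
            path = os.path.join(data_dir, rel)
            if rel == "pg_xlog":
                # Sent as an empty directory, whether it is a real
                # directory or a symlink; its contents are never visited.
                entries.append((rel, "dir"))
            elif os.path.islink(path):
                if rel_dir == "pg_tblspc":
                    # Tablespace links are the only symlinks let through.
                    entries.append((rel, "symlink"))
                else:
                    warnings.append("skipping symlink " + rel)
            elif os.path.isdir(path):
                entries.append((rel, "dir"))
                visit(rel)
            elif os.path.isfile(path):
                entries.append((rel, "file"))
            else:
                # Assumption: devices, FIFOs etc. are skipped with a warning.
                warnings.append("skipping special file " + rel)

    visit("")
    return entries, warnings
```

With pg_clog symlinked out of the data directory, this sketch reports a
warning and leaves pg_clog out of the entry list entirely, which is the
verification problem described above in miniature.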

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
