Updated version of pg_receivexlog

Lists: pgsql-hackers
From: Magnus Hagander <magnus(at)hagander(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Updated version of pg_receivexlog
Date: 2011-08-16 14:32:55
Message-ID: CABUevExQ5CeBKvPaaO0Co+hnOoJmGW3e859dWntgj1uLe94swg@mail.gmail.com

Here's an updated version of pg_receivexlog, that should now actually
work (it previously failed miserably when a replication record crossed
a WAL file boundary - something which I at the time could not properly
reproduce, but when I restarted my work on it now could easily
reproduce every time :D).
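
As a rough illustration of the boundary case mentioned above, the sketch below splits an incoming block of WAL data at a segment boundary so that each piece can go to the right file. It is not taken from the patch: `split_at_segment_boundary` and `WalPiece` are hypothetical names, and the 16 MB `XLOG_SEG_SIZE` is the default build value, assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Default WAL segment size (16 MB); an assumption, it is a build-time setting. */
#define XLOG_SEG_SIZE ((uint32_t) (16 * 1024 * 1024))

/* Hypothetical description of how one received block maps onto segment files. */
typedef struct WalPiece
{
    size_t in_current;  /* bytes that still fit in the current segment */
    size_t in_next;     /* bytes that spill over into the following segment */
} WalPiece;

/*
 * Split a block of `len` bytes that starts at byte `offset` within a segment.
 * Assumes offset < XLOG_SEG_SIZE and len <= XLOG_SEG_SIZE, so a block can
 * cross at most one boundary.
 */
static WalPiece
split_at_segment_boundary(uint32_t offset, size_t len)
{
    WalPiece piece;
    size_t   room = (size_t) (XLOG_SEG_SIZE - offset);

    piece.in_current = len < room ? len : room;
    piece.in_next = len - piece.in_current;
    return piece;
}
```

A receiver built along these lines would write `in_current` bytes, close the finished segment, open the next one, and then write the remaining `in_next` bytes.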

It also contains an update to pg_basebackup that allows it to stream
the transaction log in the background while the backup is running,
thus reducing the need for wal_keep_segments (if the client can keep
up, it should eliminate the need completely).

In doing so, it moves a number of functions from pg_basebackup.c to
the new file streamutil.c, to be shared between both pg_basebackup and
pg_receivexlog.

So far at least, it's completely client-side, with no changes to the
server. This means that it can be dropped into src/bin on 9.1 as well
to get a version that runs there (since we're way way way past feature
freeze and can't actually stick it in there in the official tree).

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachment Content-Type Size
pg_receivexlog.diff text/x-patch 64.7 KB

From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-09-28 06:38:47
Message-ID: CAJKUy5hg-MiYjFbYnLoGy8LT21bYOxfA1_WWM0BgKO_fqQaC+g@mail.gmail.com

On Tue, Aug 16, 2011 at 9:32 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> Here's an updated version of pg_receivexlog, that should now actually
> work (it previously failed miserably when a replication record crossed
> a WAL file boundary - something which I at the time could not properly
> reproduce, but when I restarted my work on it now could easily
> reproduce every time :D).
>
> It also contains an update to pg_basebackup that allows it to stream
> the transaction log in the background while the backup is running,
> thus reducing the need for wal_keep_segments (if the client can keep
> up, it should eliminate the need completely).
>

reviewing this...

I found pg_receivexlog useful as an independent utility, but I'm not
so sure about the ability to call it from pg_basebackup via the --xlog
option. When called independently, pg_receivexlog keeps streaming even
after pg_basebackup has finished, but not when it is started through
--xlog, so the use case for --xlog seems narrower and more error prone
(i.e., you said that it reduces the need for wal_keep_segments *if the
client can keep up*... how can we know that before starting
pg_basebackup?)

pg_receivexlog worked well in my tests.

pg_basebackup with --xlog=stream gives me an already recycled wal
segment message (note that the file was in pg_xlog in the standby):
FATAL: could not receive data from WAL stream: FATAL: requested WAL
segment 00000001000000000000005C has already been removed

I haven't read all the code in detail, but it seems fine to me.

in other things:
do we need to include src/bin/pg_basebackup/.gitignore in the patch?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-09-28 07:30:13
Message-ID: CAJKUy5hwjuTmPcwML+y4noAoAAdx_SFnyc4iJK_S=g1GjL=VeQ@mail.gmail.com

On Wed, Sep 28, 2011 at 1:38 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Tue, Aug 16, 2011 at 9:32 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> Here's an updated version of pg_receivexlog, that should now actually
>> work (it previously failed miserably when a replication record crossed
>> a WAL file boundary - something which I at the time could not properly
>> reproduce, but when I restarted my work on it now could easily
>> reproduce every time :D).
>>
>> It also contains an update to pg_basebackup that allows it to stream
>> the transaction log in the background while the backup is running,
>> thus reducing the need for wal_keep_segments (if the client can keep
>> up, it should eliminate the need completely).
>>
>
> reviewing this...
>

btw, executing 'make world' with this patch gives me this error (seems
like an entry is missing in doc/src/sgml/ref/allfiles.sgml):

jade:reference.sgml:223:4:E: general entity "pgReceivexlog" not
defined and no default entity

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-09-28 17:50:45
Message-ID: CABUevEy1V08ZPPhi1eECz+f0UoOSaBV48s92M3g9WgNvO6UEsw@mail.gmail.com

On Wed, Sep 28, 2011 at 08:38, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Tue, Aug 16, 2011 at 9:32 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> Here's an updated version of pg_receivexlog, that should now actually
>> work (it previously failed miserably when a replication record crossed
>> a WAL file boundary - something which I at the time could not properly
>> reproduce, but when I restarted my work on it now could easily
>> reproduce every time :D).
>>
>> It also contains an update to pg_basebackup that allows it to stream
>> the transaction log in the background while the backup is running,
>> thus reducing the need for wal_keep_segments (if the client can keep
>> up, it should eliminate the need completely).
>>
>
> reviewing this...
>
> i found useful pg_receivexlog as an independent utility, but i'm not
> so sure about the ability to call it from pg_basebackup via --xlog
> option. this is because pg_receivexlog will continue streaming even
> after pg_basebackup if it's called independently but not in the other
> case so the use case for --xlog seems more narrow and error prone (ie:
> you said that it reduces the need for wal_keep_segments *if the client
> can keep up*... how can we know that before starting pg_basebackup?)

These two are not intended to be used together.

pg_basebackup --xlog=stream is intended for the same use case as
"pg_basebackup -x" today, which is to take a backup of just the parts
that you actually need to clone the database, but to do so without
having to guesstimate the value for wal_keep_segments.

> pg_receivexlog worked good in my tests.
>
> pg_basebackup with --xlog=stream gives me an already recycled wal
> segment message (note that the file was in pg_xlog in the standby):
> FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
> segment 00000001000000000000005C has already been removed

Do you get this reproducibly? Or did you get it just once?

And when you say "in the standby" what are you referring to? There is
no standby server in the case of pg_basebackup --xlog=stream, it's
just backup... But are you saying pg_basebackup had received the file,
yet tried to get it again?

> in other things:
> do we need to include src/bin/pg_basebackup/.gitignore in the patch?

Not sure what you mean? We need to add pg_receivexlog to this file,
yes - in head it just contains pg_basebackup.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-09-28 17:56:18
Message-ID: CABUevEwiRd+Ez9mgf=muesoiKs52pFVrpEdQjSJCnR80hM+Yog@mail.gmail.com

On Wed, Sep 28, 2011 at 09:30, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Wed, Sep 28, 2011 at 1:38 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> On Tue, Aug 16, 2011 at 9:32 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> Here's an updated version of pg_receivexlog, that should now actually
>>> work (it previously failed miserably when a replication record crossed
>>> a WAL file boundary - something which I at the time could not properly
>>> reproduce, but when I restarted my work on it now could easily
>>> reproduce every time :D).
>>>
>>> It also contains an update to pg_basebackup that allows it to stream
>>> the transaction log in the background while the backup is running,
>>> thus reducing the need for wal_keep_segments (if the client can keep
>>> up, it should eliminate the need completely).
>>>
>>
>> reviewing this...
>>
>
> btw, executing 'make world' with this patch gives me this error (seems
> like an entry is missing in doc/src/sgml/ref/allfiles.sgml):
>
> jade:reference.sgml:223:4:E: general entity "pgReceivexlog" not
> defined and no default entity

Ugh, how did I miss that. You need this:

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 8a8616b..382d297 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -172,6 +172,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY pgCtl SYSTEM "pg_ctl-ref.sgml">
<!ENTITY pgDump SYSTEM "pg_dump.sgml">
<!ENTITY pgDumpall SYSTEM "pg_dumpall.sgml">
+<!ENTITY pgReceivexlog SYSTEM "pg_receivexlog.sgml">
<!ENTITY pgResetxlog SYSTEM "pg_resetxlog.sgml">
<!ENTITY pgRestore SYSTEM "pg_restore.sgml">
<!ENTITY postgres SYSTEM "postgres-ref.sgml">

I think I broke it in a merge at some point..
--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-09-28 23:55:31
Message-ID: CAJKUy5gN-xfH9Lar2pjOombP0GYTajb0FQde9=uL3g6bKd5Wvg@mail.gmail.com

On Wed, Sep 28, 2011 at 12:50 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
>> pg_receivexlog worked good in my tests.
>>
>> pg_basebackup with --xlog=stream gives me an already recycled wal
>> segment message (note that the file was in pg_xlog in the standby):
>> FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
>> segment 00000001000000000000005C has already been removed
>
> Do you get this reproducibly? Or did you get it just once?
>
> And when you say "in the standby" what are you referring to? There is
> no standby server in the case of pg_basebackup --xlog=stream, it's
> just backup... But are you saying pg_basebackup had received the file,
> yet tried to get it again?
>

OK, I was trying to set up a standby server by cloning with
pg_basebackup... can't I use it that way?

The docs say:
"""
If this option is specified, it is possible to start a postmaster
directly in the extracted directory without the need to consult the
log archive, thus making this a completely standalone backup.
"""

It doesn't say that it is not possible to use this for a standby
server... that's probably why I got the error: I put a recovery.conf
in place after pg_basebackup finished... maybe we can say that more
loudly?

>
>> in other things:
>> do we need to include src/bin/pg_basebackup/.gitignore in the patch?
>
> Not sure what you mean? We need to add pg_receivexlog to this file,
> yes - in head it just contains pg_basebackup.
>

Your patch includes a modification to the file
src/bin/pg_basebackup/.gitignore. Maybe I'm just being annoying, and
besides, it's a simple change... just forget that...

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-09-29 20:30:32
Message-ID: CABUevEypmoWyWdnaP2AfOENTE9=dp+LjtO8GEV3K6bU_jmWFPg@mail.gmail.com

On Thu, Sep 29, 2011 at 01:55, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Wed, Sep 28, 2011 at 12:50 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>
>>> pg_receivexlog worked good in my tests.
>>>
>>> pg_basebackup with --xlog=stream gives me an already recycled wal
>>> segment message (note that the file was in pg_xlog in the standby):
>>> FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
>>> segment 00000001000000000000005C has already been removed
>>
>> Do you get this reproducibly? Or did you get it just once?
>>
>> And when you say "in the standby" what are you referring to? There is
>> no standby server in the case of pg_basebackup --xlog=stream, it's
>> just backup... But are you saying pg_basebackup had received the file,
>> yet tried to get it again?
>>
>
> ok, i was trying to setup a standby server cloning with
> pg_basebackup... i can't use it that way?
>
> the docs says:
> """
> If this option is specified, it is possible to start a postmaster
> directly in the extracted directory without the need to consult the
> log archive, thus making this a completely standalone backup.
> """
>
> it doesn't say that is not possible to use this for a standby
> server... probably that's why i get the error i put a recovery.conf
> after pg_basebackup finished... maybe we can say that  more loudly?

The idea is, if you use it with -x (or --xlog), it's for taking a
backup/clone, *not* for replication.

If you use it without -x, then you can use it as the start of a
replica, by adding a recovery.conf.

But you can't do both at once, that will confuse it.

>>> in other things:
>>> do we need to include src/bin/pg_basebackup/.gitignore in the patch?
>>
>> Not sure what you mean? We need to add pg_receivexlog to this file,
>> yes - in head it just contains pg_basebackup.
>>
>
> your patch includes a modification in the file
> src/bin/pg_basebackup/.gitignore, maybe i'm just being annoying
> besides is a simple change... just forget that...

Well, it needs to be included in the commit, and if I exclude it in the
posted patch, I'll just forget it in the end :-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-24 11:46:59
Message-ID: 4EA55033.8010005@enterprisedb.com

> + /*
> + * Looks like an xlog file. Parse it's position.

s/it's/its/

> + */
> + if (sscanf(dirent->d_name, "%08X%08X%08X", &tli, &log, &seg) != 3)
> + {
> + fprintf(stderr, _("%s: could not parse xlog filename \"%s\"\n"),
> + progname, dirent->d_name);
> + disconnect_and_exit(1);
> + }
> + log *= XLOG_SEG_SIZE;

That multiplication by XLOG_SEG_SIZE could overflow, if logid is very
high. It seems completely unnecessary, anyway.
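
To make the overflow concern concrete, here is a minimal sketch. It assumes `log` is a 32-bit unsigned value (which the `%08X` conversion in the quoted code suggests) and the default 16 MB segment size. The function names are hypothetical and only contrast 32-bit with 64-bit arithmetic.

```c
#include <stdint.h>

/* Default WAL segment size (16 MB); an assumption, it is a build-time setting. */
#define XLOG_SEG_SIZE ((uint32_t) (16 * 1024 * 1024))

/* 32-bit multiplication: the product wraps around once log reaches 0x100. */
static uint32_t
scale_log_32(uint32_t log)
{
    return log * XLOG_SEG_SIZE;
}

/* Widening to 64 bits before multiplying keeps the full product. */
static uint64_t
scale_log_64(uint32_t log)
{
    return (uint64_t) log * XLOG_SEG_SIZE;
}
```

With these definitions, a log ID of 0x100 scales to 0 in the 32-bit version, because 0x100 * 0x1000000 no longer fits in 32 bits, while the 64-bit version yields 0x100000000.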

s/IDENFITY_SYSTEM/IDENTIFY_SYSTEM/ (two occurrences)

In pg_basebackup, it would be a good sanity check to check that the
systemid returned by IDENTIFY_SYSTEM in the main connection and the
WAL-streaming connection match. Just to be sure that some connection
pooler didn't hijack one of the connections and point to a different
server. And better check timelineid too while you're at it.
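
A sketch of what such a sanity check could look like, assuming the system identifier and timeline returned by IDENTIFY_SYSTEM on each connection have already been copied into a small struct. `SystemIdentity` and `same_server` are hypothetical names for illustration, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the first two columns returned by IDENTIFY_SYSTEM. */
typedef struct SystemIdentity
{
    char     sysid[32];   /* system identifier, kept as text */
    uint32_t timeline;    /* current timeline ID */
} SystemIdentity;

/* True only if both connections report the same system and timeline. */
static bool
same_server(const SystemIdentity *main_conn, const SystemIdentity *wal_conn)
{
    return strcmp(main_conn->sysid, wal_conn->sysid) == 0 &&
           main_conn->timeline == wal_conn->timeline;
}
```

The caller would run this once, right after both connections are established, and abort the backup on a mismatch.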

How does this interact with synchronous replication? If a base backup
that streams WAL is in progress, and you have synchronous_standby_names
set to '*', I believe the in-progress backup will count as a standby for
that purpose. That might give a false sense of security.
synchronous_standby_names='*' is prone to such confusion in general, but
it seems particularly surprising if a running pg_basebackup lets a
commit in synchronous replication proceed. Maybe we just need a warning
in the docs. I think we should advise that
synchronous_standby_names='*' is dangerous in general, and cite this as
one reason for that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-24 12:40:36
Message-ID: CABUevEwtXWL12C+Qb0YNtUiWM4FV3wmEn1okqb-DNSb_bpNJcw@mail.gmail.com

On Mon, Oct 24, 2011 at 13:46, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> +               /*
>> +                * Looks like an xlog file. Parse it's position.
>
> s/it's/its/
>
>> +                */
>> +               if (sscanf(dirent->d_name, "%08X%08X%08X", &tli, &log, &seg) != 3)
>> +               {
>> +                       fprintf(stderr, _("%s: could not parse xlog filename \"%s\"\n"),
>> +                                       progname, dirent->d_name);
>> +                       disconnect_and_exit(1);
>> +               }
>> +               log *= XLOG_SEG_SIZE;
>
> That multiplication by XLOG_SEG_SIZE could overflow, if logid is very high.
> It seems completely unnecessary, anyway,

How do you mean completely unnecessary? We'd have to change the points
that use it to divide by XLOG_SEG_SIZE otherwise, no? That might be a
way to get around the overflow, but I'm not sure that's what you mean?

> s/IDENFITY_SYSTEM/IDENTIFY_SYSTEM/ (two occurrences)

Oops.

> In pg_basebackup, it would be a good sanity check to check that the systemid
> returned by IDENTIFY_SYSTEM in the main connection and the WAL-streaming
> connection match. Just to be sure that some connection pooler didn't hijack
> one of the connections and point to a different server. And better check
> timelineid too while you're at it.

That's a good idea. Will fix.

> How does this interact with synchronous replication? If a base backup that
> streams WAL is in progress, and you have synchronous_standby_names set to
> '*', I believe the in-progress backup will count as a standby for that
> purpose. That might give a false sense of security.

Ah yes. Did not think of that. Yes, it will have this problem.

> synchronous_standby_names='*' is prone to such confusion in general, but it
> seems that it's particularly surprising if a running pg_basebackup lets a
> commit in synchronous replication to proceed. Maybe we just need a warning
> in the docs. I think we should advise that synchronous_standby_names='*' is
> dangerous in general, and cite this as one reason for that.

Hmm. I think this is common enough that we want to make sure we avoid
it in code.

Could we pass a parameter from the client indicating to the master
that it refuses to be a sync slave? An optional keyword to the
START_REPLICATION command, perhaps?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-24 14:12:34
Message-ID: CAJKUy5ghp-T04Y5jVchBiSTgRJENZ=8JNrYbTMPf=_Qi6BivCw@mail.gmail.com

On Mon, Oct 24, 2011 at 7:40 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
>> synchronous_standby_names='*' is prone to such confusion in general, but it
>> seems that it's particularly surprising if a running pg_basebackup lets a
>> commit in synchronous replication to proceed. Maybe we just need a warning
>> in the docs. I think we should advise that synchronous_standby_names='*' is
>> dangerous in general, and cite this as one reason for that.
>
> Hmm. i think this is common enough that we want to make sure we avoid
> it in code.
>
> Could we pass a parameter from the client indicating to the master
> that it refuses to be a sync slave? An optional keyword to the
> START_REPLICATION command, perhaps?
>

can't you execute "set synchronous_commit to off/local" for this connection?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-24 14:16:49
Message-ID: CABUevEzbr7WY8+8Z4McE10ELYxkv15jJhha2iNoZLsaL30Drsg@mail.gmail.com

On Mon, Oct 24, 2011 at 16:12, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Mon, Oct 24, 2011 at 7:40 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>
>>> synchronous_standby_names='*' is prone to such confusion in general, but it
>>> seems that it's particularly surprising if a running pg_basebackup lets a
>>> commit in synchronous replication to proceed. Maybe we just need a warning
>>> in the docs. I think we should advise that synchronous_standby_names='*' is
>>> dangerous in general, and cite this as one reason for that.
>>
>> Hmm. i think this is common enough that we want to make sure we avoid
>> it in code.
>>
>> Could we pass a parameter from the client indicating to the master
>> that it refuses to be a sync slave? An optional keyword to the
>> START_REPLICATION command, perhaps?
>>
>
> can't you execute "set synchronous_commit to off/local" for this connection?

This is a walsender connection, it doesn't take SQL. Plus it's the
receiving end, and SET sync_commit is for the sending end.

That said, we are reasonably safe in the current implementation, because
it always sets the flush location to invalidxlogptr, so it will not be
considered as a sync slave. Should we ever start accepting "write" as
the point to sync against, the problem will show up, of course.
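
The reasoning above can be pictured with a toy model. This is not the server's actual code: `WalPos`, `ReceiverStatus` and `can_be_sync_standby` are hypothetical, and treating {0, 0} as the invalid position is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a WAL position; {0, 0} is treated as "invalid". */
typedef struct WalPos
{
    uint32_t xlogid;
    uint32_t xrecoff;
} WalPos;

/* Positions a receiving client reports back to the sender. */
typedef struct ReceiverStatus
{
    WalPos write;   /* how far the client has written */
    WalPos flush;   /* how far the client reports having flushed */
} ReceiverStatus;

static bool
pos_is_invalid(WalPos pos)
{
    return pos.xlogid == 0 && pos.xrecoff == 0;
}

/* Toy rule: only a receiver reporting a valid flush position can be sync. */
static bool
can_be_sync_standby(const ReceiverStatus *status)
{
    return !pos_is_invalid(status->flush);
}
```

Under this rule a client that only ever fills in `write` is never picked, which is the behaviour described here. If the rule were changed to look at `write` instead, the same client would start to qualify.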

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-25 10:37:08
Message-ID: CABUevEzWrz1wG=jedGaenL_Y-osVejnZRjvBD-iD5QWwrwYzWw@mail.gmail.com

On Mon, Oct 24, 2011 at 14:40, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Mon, Oct 24, 2011 at 13:46, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> +               /*
>>> +                * Looks like an xlog file. Parse it's position.
>>
>> s/it's/its/
>>
>>> +                */
>>> +               if (sscanf(dirent->d_name, "%08X%08X%08X", &tli, &log, &seg) != 3)
>>> +               {
>>> +                       fprintf(stderr, _("%s: could not parse xlog filename \"%s\"\n"),
>>> +                                       progname, dirent->d_name);
>>> +                       disconnect_and_exit(1);
>>> +               }
>>> +               log *= XLOG_SEG_SIZE;
>>
>> That multiplication by XLOG_SEG_SIZE could overflow, if logid is very high.
>> It seems completely unnecessary, anyway,
>
> How do you mean completely unnecessary? We'd have to change the points
> that use it to divide by XLOG_SEG_SIZE otherwise, no? That might be a
> way to get around the overflow, but I'm not sure that's what you mean?

Talked to Heikki on IM about this one, turns out we were both wrong.
It's needed, but there was a bug hiding under it, due to (once again)
mixing up segments and offsets. Has been fixed now.
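
The thread does not show the fix itself. One plausible reading of "mixing up segments and offsets" is that the segment number, not the log ID, is what should be scaled into a byte offset. The sketch below follows that assumption: `parse_wal_name` and `WalFilePos` are hypothetical names, and the 16 MB segment size is the default.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Default WAL segment size (16 MB); an assumption, it is a build-time setting. */
#define XLOG_SEG_SIZE ((uint32_t) (16 * 1024 * 1024))

/* Hypothetical result of parsing a 24-hex-digit WAL file name. */
typedef struct WalFilePos
{
    unsigned int tli;      /* timeline ID */
    unsigned int log;      /* log file ID */
    unsigned int seg;      /* segment number within the log */
    uint32_t     offset;   /* byte offset of the segment start within the log */
} WalFilePos;

/* Returns false if the name does not start with three hex fields. */
static bool
parse_wal_name(const char *name, WalFilePos *out)
{
    if (sscanf(name, "%08X%08X%08X", &out->tli, &out->log, &out->seg) != 3)
        return false;

    /* Scale the segment number, not the log ID, to get a byte offset. */
    out->offset = (uint32_t) out->seg * XLOG_SEG_SIZE;
    return true;
}
```

For the segment name quoted earlier in the thread, 00000001000000000000005C, this gives timeline 1, log 0, segment 0x5C and offset 0x5C000000. With 16 MB segments the segment number stays below 0x100, so the scaled value fits in 32 bits.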

>> In pg_basebackup, it would be a good sanity check to check that the systemid
>> returned by IDENTIFY_SYSTEM in the main connection and the WAL-streaming
>> connection match. Just to be sure that some connection pooler didn't hijack
>> one of the connections and point to a different server. And better check
>> timelineid too while you're at it.
>
> That's a good idea. Will fix.

Added to the new version of the patch.

>> How does this interact with synchronous replication? If a base backup that
>> streams WAL is in progress, and you have synchronous_standby_names set to
>> '*', I believe the in-progress backup will count as a standby for that
>> purpose. That might give a false sense of security.
>
> Ah yes. Did not think of that. Yes, it will have this problem.

Actually, thinking more, per other mail, it won't. Because it will
never report that the data is synced to disk, so it will not be
considered for sync standby.

This is something we might consider in the future (it could be a
reasonable scenario where you had this), but not in the first version.

Updated version of the patch attached.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachment Content-Type Size
pg_receivexlog2.diff text/x-patch 67.1 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-26 18:29:04
Message-ID: CABUevEw9vkcEKHuEpcC=VjEm8Cd3BccBnRetB-ujoLKk-n-gXQ@mail.gmail.com

On Tue, Oct 25, 2011 at 12:37, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Mon, Oct 24, 2011 at 14:40, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> On Mon, Oct 24, 2011 at 13:46, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>>> +               /*
>>>> +                * Looks like an xlog file. Parse it's position.
>>>
>>> s/it's/its/
>>>
>>>> +                */
>>>> +               if (sscanf(dirent->d_name, "%08X%08X%08X", &tli, &log, &seg) != 3)
>>>> +               {
>>>> +                       fprintf(stderr, _("%s: could not parse xlog filename \"%s\"\n"),
>>>> +                                       progname, dirent->d_name);
>>>> +                       disconnect_and_exit(1);
>>>> +               }
>>>> +               log *= XLOG_SEG_SIZE;
>>>
>>> That multiplication by XLOG_SEG_SIZE could overflow, if logid is very high.
>>> It seems completely unnecessary, anyway,
>>
>> How do you mean completely unnecessary? We'd have to change the points
>> that use it to divide by XLOG_SEG_SIZE otherwise, no? That might be a
>> way to get around the overflow, but I'm not sure that's what you mean?
>
> Talked to Heikki on IM about this one, turns out we were both wrong.
> It's needed, but there was a bug hiding under it, due to (once again)
> mixing up segments and offsets. Has been fixed now.
>
>>> In pg_basebackup, it would be a good sanity check to check that the systemid
>>> returned by IDENTIFY_SYSTEM in the main connection and the WAL-streaming
>>> connection match. Just to be sure that some connection pooler didn't hijack
>>> one of the connections and point to a different server. And better check
>>> timelineid too while you're at it.
>>
>> That's a good idea. Will fix.
>
> Added to the new version of the patch.
>
>
>>> How does this interact with synchronous replication? If a base backup that
>>> streams WAL is in progress, and you have synchronous_standby_names set to
>>> '*', I believe the in-progress backup will count as a standby for that
>>> purpose. That might give a false sense of security.
>>
>> Ah yes. Did not think of that. Yes, it will have this problem.
>
> Actually, thinking more, per other mail, it won't. Because it will
> never report that the data is synced to disk, so it will not be
> considered for sync standby.
>
> This is something we might consider in the future (it could be a
> reasonable scenario where you had this), but not in the first version.
>
> Updated version of the patch attached.

I've applied this version with a few more minor changes that Heikki found.

His comment about .partial files still applies, and I intend to
address this in a follow-up commit, along with some further
documentation enhancements.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 07:29:21
Message-ID: CAHGQGwH1wF0oD5U3_X0-0AftxKLV6fPZUZphEm1quMeheLpd7Q@mail.gmail.com

On Thu, Oct 27, 2011 at 3:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> I've applied this version with a few more minor changes that Heikki found.

Cool!

When I tried pg_receivexlog and checked the contents of the streamed WAL
file with xlogdump, I found that recent WAL records that walsender had
already sent did not exist in that WAL file. I expected pg_receivexlog to
write the streamed WAL records to disk as soon as possible, but it
doesn't. Is this intentional, or a bug? Am I missing something?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 07:40:20
Message-ID: CABUevEx35CKXMxQX1MfLprYhLEf7v8JGf-vVHpkBUy5aqXXYqg@mail.gmail.com

On Thu, Oct 27, 2011 at 09:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Oct 27, 2011 at 3:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> I've applied this version with a few more minor changes that Heikki found.
>
> Cool!
>
> When I tried pg_receivexlog and checked the contents of streamed WAL file by
> xlogdump, I found that recent WAL records that walsender has already sent don't
> exist in that WAL file. I expected that pg_receivexlog writes the streamed WAL
> records to the disk as soon as possible, but it doesn't. Is this
> intentional? Or bug?
> Am I missing something?

It writes it to disk as soon as possible, but doesn't fsync() until
the end of each segment. Are you by any chance looking at the file
while it's running?
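
To make the write()/fsync() distinction above concrete, here is a minimal Python sketch. It is not the pg_receivexlog code: the function names `append_chunk` and `demo` and the tiny segment size are made up for illustration. Each chunk is written as soon as it arrives, and fsync is called only once the segment is full.

```python
import os
import tempfile

SEGMENT_SIZE = 32  # hypothetical tiny size for the demo; real WAL segments default to 16 MB


def append_chunk(fd, chunk, written):
    """Write a chunk right away; fsync only when the segment becomes full.

    Returns (total bytes written so far, whether fsync was called).
    """
    os.write(fd, chunk)      # hands the data to the operating system immediately
    written += len(chunk)
    synced = False
    if written >= SEGMENT_SIZE:
        os.fsync(fd)         # force the data to stable storage at segment end
        synced = True
    return written, synced


def demo():
    path = os.path.join(tempfile.mkdtemp(), "segment")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written, synced_first = append_chunk(fd, b"a" * 10, 0)
        # The data is already visible to other readers before any fsync.
        visible = os.path.getsize(path)
        written, synced_last = append_chunk(fd, b"b" * 22, written)
    finally:
        os.close(fd)
    return visible, synced_first, synced_last
```

In this sketch the first chunk is visible in the file without any fsync, which is why a missing fsync alone would not explain records that are absent after a clean shutdown.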

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 07:46:50
Message-ID: CAHGQGwF-SA3hZci5MHONG8FjRV4wJg4hW2Uac34fEnZFJYPOXg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 4:40 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Oct 27, 2011 at 09:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Thu, Oct 27, 2011 at 3:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> I've applied this version with a few more minor changes that Heikki found.
>>
>> Cool!
>>
>> When I tried pg_receivexlog and checked the contents of streamed WAL file by
>> xlogdump, I found that recent WAL records that walsender has already sent don't
>> exist in that WAL file. I expected that pg_receivexlog writes the streamed WAL
>> records to the disk as soon as possible, but it doesn't. Is this
>> intentional? Or bug?
>> Am I missing something?
>
> It writes it to disk as soon as possible, but doesn't fsync() until
> the end of each segment. Are you by any chance looking at the file
> while it's running?

No. I looked at that file after shutting down the master server.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 07:49:35
Message-ID: CABUevEyLYWTscvs4-wVvQ86p893vEJ5=buHMYhf9tsJKAbK4dA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 09:46, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Oct 27, 2011 at 4:40 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> On Thu, Oct 27, 2011 at 09:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> On Thu, Oct 27, 2011 at 3:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>> I've applied this version with a few more minor changes that Heikki found.
>>>
>>> Cool!
>>>
>>> When I tried pg_receivexlog and checked the contents of streamed WAL file by
>>> xlogdump, I found that recent WAL records that walsender has already sent don't
>>> exist in that WAL file. I expected that pg_receivexlog writes the streamed WAL
>>> records to the disk as soon as possible, but it doesn't. Is this
>>> intentional? Or bug?
>>> Am I missing something?
>>
>> It writes it to disk as soon as possible, but doesn't fsync() until
>> the end of each segment. Are you by any chance looking at the file
>> while it's running?
>
> No. I looked at that file after shutting down the master server.

Ugh, in that case something is certainly wrong. There is nothing but
setting up some offset values between PQgetCopyData() and write()...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 08:12:26
Message-ID: CAHGQGwH54sMZUWTX=n8CA05pXetAErv_QZ+nDmqJY85w5YfEOA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 4:49 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Oct 27, 2011 at 09:46, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Thu, Oct 27, 2011 at 4:40 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> On Thu, Oct 27, 2011 at 09:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>> On Thu, Oct 27, 2011 at 3:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>>> I've applied this version with a few more minor changes that Heikki found.
>>>>
>>>> Cool!
>>>>
>>>> When I tried pg_receivexlog and checked the contents of streamed WAL file by
>>>> xlogdump, I found that recent WAL records that walsender has already sent don't
>>>> exist in that WAL file. I expected that pg_receivexlog writes the streamed WAL
>>>> records to the disk as soon as possible, but it doesn't. Is this
>>>> intentional? Or bug?
>>>> Am I missing something?
>>>
>>> It writes it to disk as soon as possible, but doesn't fsync() until
>>> the end of each segment. Are you by any chance looking at the file
>>> while it's running?
>>
>> No. I looked at that file after shutting down the master server.
>
> Ugh, in that case something is certainly wrong. There is nothing but
> setting up some offset values between PQgetCopyData() and write()...

When end-of-copy stream is found or an error happens, pg_receivexlog
exits without flushing outstanding WAL records. Which seems to cause
the problem I reported.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 08:18:48
Message-ID: CABUevEzP1NocR1X+GOgS+X6bA4xaYtd7rRjHqWC3xOsGqSzRKA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 10:12, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Oct 27, 2011 at 4:49 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> On Thu, Oct 27, 2011 at 09:46, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> On Thu, Oct 27, 2011 at 4:40 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>> On Thu, Oct 27, 2011 at 09:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>>> On Thu, Oct 27, 2011 at 3:29 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>>>> I've applied this version with a few more minor changes that Heikki found.
>>>>>
>>>>> Cool!
>>>>>
>>>>> When I tried pg_receivexlog and checked the contents of streamed WAL file by
>>>>> xlogdump, I found that recent WAL records that walsender has already sent don't
>>>>> exist in that WAL file. I expected that pg_receivexlog writes the streamed WAL
>>>>> records to the disk as soon as possible, but it doesn't. Is this
>>>>> intentional? Or bug?
>>>>> Am I missing something?
>>>>
>>>> It writes it to disk as soon as possible, but doesn't fsync() until
>>>> the end of each segment. Are you by any chance looking at the file
>>>> while it's running?
>>>
>>> No. I looked at that file after shutting down the master server.
>>
>> Ugh, in that case something is certainly wrong. There is nothing but
>> setting up some offset values between PQgetCopyData() and write()...
>
> When end-of-copy stream is found or an error happens, pg_receivexlog
> exits without flushing outstanding WAL records. Which seems to cause
> the problem I reported.

Not sure I follow. When we arrive at PQgetCopyData() there should be
nothing buffered, and if the end of stream happens there it returns
-1, and we exit, no? So where is the data that's lost?

I do realize we don't actually fsync() and close() in this case - is
that what you are referring to? But the data should already have been
write()d, so it should still be there, no?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 09:25:59
Message-ID: CAHGQGwG==wdDrD9w045k4umWRhSto5_Jsqr7jYmgjBAsHC8SQw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 5:18 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> Not sure I follow. When we arrive at PQgetCopyData() there should be
> nothing buffered, and if the end of stream happens there it returns
> -1, and we exit, no? So where is the data that's lost?
>
> I do realize we don't actually fsync() and close() in this case - is
> that what you are referring to? But the data should already have been
> write()d, so it should still be there, no?

Oh, right. Hmm.. xlogdump might be the cause.

Though I've not read the code of xlogdump, I wonder if it gives up
outputting the contents of WAL file when it finds a partial WAL page...
This strikes me that recovery code has the same problem. No?
IOW, when a partial WAL page is found during recovery, I'm afraid
that page would not be replayed though it contains valid data.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 10:29:56
Message-ID: CAHGQGwH3xosr=+Pfrwho+rCsTKyM-BeqWfZXvBy2VUWWTa0hmQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 6:25 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Oct 27, 2011 at 5:18 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> Not sure I follow. When we arrive at PQgetCopyData() there should be
>> nothing buffered, and if the end of stream happens there it returns
>> -1, and we exit, no? So where is the data that's lost?
>>
>> I do realize we don't actually fsync() and close() in this case - is
>> that what you are referring to? But the data should already have been
>> write()d, so it should still be there, no?
>
> Oh, right. Hmm.. xlogdump might be the cause.
>
> Though I've not read the code of xlogdump, I wonder if it gives up
> outputting the contents of WAL file when it finds a partial WAL page...
> This strikes me that recovery code has the same problem. No?
> IOW, when a partial WAL page is found during recovery, I'm afraid
> that page would not be replayed though it contains valid data.

My concern was right. When I performed a recovery using the streamed
WAL files, the loss of data happened. A partial WAL page was not replayed.

To avoid this problem, I think that we should change pg_receivexlog so
that it writes WAL data *by the block*, or creates, like walreceiver, WAL file
before writing any data. Otherwise, though pg_receivexlog streams WAL
data in realtime, the latest WAL data might not be available for recovery.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 10:48:42
Message-ID: CABUevExCS-iO+bs9imSK4SnEsaHVCmZZpH3J0Vj_ydXeBrfe_g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 12:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Oct 27, 2011 at 6:25 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Thu, Oct 27, 2011 at 5:18 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> Not sure I follow. When we arrive at PQgetCopyData() there should be
>>> nothing buffered, and if the end of stream happens there it returns
>>> -1, and we exit, no? So where is the data that's lost?
>>>
>>> I do realize we don't actually fsync() and close() in this case - is
>>> that what you are referring to? But the data should already have been
>>> write()d, so it should still be there, no?
>>
>> Oh, right. Hmm.. xlogdump might be the cause.
>>
>> Though I've not read the code of xlogdump, I wonder if it gives up
>> outputting the contents of WAL file when it finds a partial WAL page...
>> This strikes me that recovery code has the same problem. No?
>> IOW, when a partial WAL page is found during recovery, I'm afraid
>> that page would not be replayed though it contains valid data.
>
> My concern was right. When I performed a recovery using the streamed
> WAL files, the loss of data happened. A partial WAL page was not replayed.
>
> To avoid this problem, I think that we should change pg_receivexlog so
> that it writes WAL data *by the block*, or creates, like walreceiver, WAL file
> before writing any data. Otherwise, though pg_receivexlog streams WAL
> data in realtime, the latest WAL data might not be available for recovery.

Ah, so you were recovering data from the last, partial, file? Not from
a completed file?

I'm rewriting the handling of partial files per the other thread
started by Heikki. The idea is that there will be an actual .partial
file in there when pg_receivexlog has ended, and you have to deal with
it manually. The typical way would be to pad it with zeroes to the
end. Doing such padding would solve this recovery issue, correct?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 11:09:27
Message-ID: CAHGQGwHNXmse+krBQRwcwca6W8Z=DcKMY50ru-Yun0y8ZD5img@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 7:48 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Oct 27, 2011 at 12:29, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Thu, Oct 27, 2011 at 6:25 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> On Thu, Oct 27, 2011 at 5:18 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>> Not sure I follow. When we arrive at PQgetCopyData() there should be
>>>> nothing buffered, and if the end of stream happens there it returns
>>>> -1, and we exit, no? So where is the data that's lost?
>>>>
>>>> I do realize we don't actually fsync() and close() in this case - is
>>>> that what you are referring to? But the data should already have been
>>>> write()d, so it should still be there, no?
>>>
>>> Oh, right. Hmm.. xlogdump might be the cause.
>>>
>>> Though I've not read the code of xlogdump, I wonder if it gives up
>>> outputting the contents of WAL file when it finds a partial WAL page...
>>> This strikes me that recovery code has the same problem. No?
>>> IOW, when a partial WAL page is found during recovery, I'm afraid
>>> that page would not be replayed though it contains valid data.
>>
>> My concern was right. When I performed a recovery using the streamed
>> WAL files, the loss of data happened. A partial WAL page was not replayed.
>>
>> To avoid this problem, I think that we should change pg_receivexlog so
>> that it writes WAL data *by the block*, or creates, like walreceiver, WAL file
>> before writing any data. Otherwise, though pg_receivexlog streams WAL
>> data in realtime, the latest WAL data might not be available for recovery.
>
> Ah, so you were recovering data from the last, partial, file? Not from
> a completed file?

Yes. I copied all streamed WAL files to pg_xlog directory and started recovery.

> I'm rewriting the handling of partial files per the other thread
> started by Heikki. The idea is that there will be an actual .partial
> file in there when pg_receivexlog has ended, and you have to deal with
> it manually. The typical way would be to pad it with zeroes to the
> end. Doing such padding would solve this recovery issue, correct?

Yes. But that sounds unuserfriendly. Padding the WAL file manually
is easy-to-do for a user?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 11:19:04
Message-ID: 4EA93E28.1000004@enterprisedb.com
Lists: pgsql-hackers

On 27.10.2011 14:09, Fujii Masao wrote:
> On Thu, Oct 27, 2011 at 7:48 PM, Magnus Hagander<magnus(at)hagander(dot)net> wrote:
>> I'm rewriting the handling of partial files per the other thread
>> started by Heikki. The idea is that there will be an actual .partial
>> file in there when pg_receivexlog has ended, and you have to deal with
>> it manually. The typical way would be to pad it with zeroes to the
>> end. Doing such padding would solve this recovery issue, correct?
>
> Yes. But that sounds unuserfriendly. Padding the WAL file manually
> is easy-to-do for a user?

"truncate -s 16M <file>" works at least on my Linux system. Not sure how
you'd do it on Windows.
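
For platforms without a truncate utility, the same padding can be sketched in a few lines of Python. This is only an illustration, not an existing tool: the function name `pad_segment` is hypothetical, and the 16 MB constant is the default segment size assumed in this thread. Extending a file with truncate() zero-fills the new tail (on Windows this holds as of Python 3.5).

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size assumed here


def pad_segment(path, size=WAL_SEGMENT_SIZE):
    """Extend the file at `path` with zero bytes up to `size`; never shrink it.

    Returns the number of padding bytes that were added.
    """
    current = os.path.getsize(path)
    if current > size:
        raise ValueError("file is larger than a WAL segment: %d bytes" % current)
    with open(path, "r+b") as f:
        f.truncate(size)  # growing the file leaves the existing bytes untouched
    return size - current
```

Unlike a bare "truncate -s 16M", the sketch refuses to shrink a file that is already larger than one segment, so it cannot silently cut off data.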

Perhaps we should add automatic padding in the server, though. It
wouldn't take much code in the server, and would make life easier for
people writing their scripts. Maybe even have the backend check for a
.partial file if it can't find a regularly named XLOG file. Needs some
thought..

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 11:23:24
Message-ID: CABUevEwv8gEp2EddQJs=n5W0ot0UxmYG5nni2XkQ6KCd6i9beA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 13:19, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 27.10.2011 14:09, Fujii Masao wrote:
>>
>> On Thu, Oct 27, 2011 at 7:48 PM, Magnus Hagander<magnus(at)hagander(dot)net>
>>  wrote:
>>>
>>> I'm rewriting the handling of partial files per the other thread
>>> started by Heikki. The idea is that there will be an actual .partial
>>> file in there when pg_receivexlog has ended, and you have to deal with
>>> it manually. The typical way would be to pad it with zeroes to the
>>> end. Doing such padding would solve this recovery issue, correct?
>>
>> Yes. But that sounds unuserfriendly. Padding the WAL file manually
>> is easy-to-do for a user?
>
> "truncate -s 16M <file>" works at least on my Linux system. Not sure how
> you'd do it on Windows.

Yeah, that's easy enough. I don't think there are similar tools on
Windows, but we could probably put together a quick script for people
to use if necessary.

> Perhaps we should add automatic padding in the server, though. It wouldn't
> take much code in the server, and would make life easier for people writing
> their scripts. Maybe even have the backend check for a .partial file if it
> can't find a regularly named XLOG file. Needs some thought..

I'd definitely want to avoid anything that requires pg_receivexlog to
actually *parse* the WAL. That'll make it way more complex than I'd
like.

Having recovery consider a .partial file might be interesting. It
could consider that only if there are no other complete files
available, or something like that?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 12:00:08
Message-ID: CA+Tgmobvk5psaCnY54oc_X7iK9ZNW=XM-yC9khBcjaGAfhtofA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 7:19 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 27.10.2011 14:09, Fujii Masao wrote:
>>
>> On Thu, Oct 27, 2011 at 7:48 PM, Magnus Hagander<magnus(at)hagander(dot)net>
>>  wrote:
>>>
>>> I'm rewriting the handling of partial files per the other thread
>>> started by Heikki. The idea is that there will be an actual .partial
>>> file in there when pg_receivexlog has ended, and you have to deal with
>>> it manually. The typical way would be to pad it with zeroes to the
>>> end. Doing such padding would solve this recovery issue, correct?
>>
>> Yes. But that sounds unuserfriendly. Padding the WAL file manually
>> is easy-to-do for a user?
>
> "truncate -s 16M <file>" works at least on my Linux system. Not sure how
> you'd do it on Windows.

One of the common complaints I hear about PostgreSQL is that our replication
system is more difficult to set up than people would like, and it's
too easy to make mistakes that can corrupt your data without realizing
it; I don't think making them need to truncate a file to 16 megabytes
is going to improve things there.

> Perhaps we should add automatic padding in the server, though. It wouldn't
> take much code in the server, and would make life easier for people writing
> their scripts. Maybe even have the backend check for a .partial file if it
> can't find a regularly named XLOG file. Needs some thought..

+1 for figuring out something, though I'm not sure exactly what.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 14:54:00
Message-ID: 28587.1319727240@sss.pgh.pa.us
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Thu, Oct 27, 2011 at 13:19, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 27.10.2011 14:09, Fujii Masao wrote:
>>> Yes. But that sounds unuserfriendly. Padding the WAL file manually
>>> is easy-to-do for a user?

> I'd definitely want to avoid anything that requires pg_receivexlog to
> actually *parse* the WAL. That'll make it way more complex than I'd
> like.

What parsing? Just pad to 16MB with zeroes. In fact, I think the
receiver should just create the file that size to start with, and then
write received data into it, much like normal WAL creation does.
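
A rough Python sketch of that idea, for illustration only: create the file at full size first, then overwrite it in place as data arrives. The names `create_segment` and `write_at` are hypothetical, and writing real zero blocks (rather than creating a sparse file) is a choice made in this sketch so that disk space is reserved up front.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size assumed here
O_BINARY = getattr(os, "O_BINARY", 0)  # only defined on Windows


def create_segment(path, size=SEGMENT_SIZE, block=8192):
    """Create a new zero-filled file of exactly `size` bytes and return its fd."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL | O_BINARY, 0o600)
    try:
        zeros = bytes(block)
        remaining = size
        while remaining > 0:
            n = os.write(fd, zeros[:min(block, remaining)])
            remaining -= n
        os.fsync(fd)
    except BaseException:
        os.close(fd)
        raise
    return fd


def write_at(fd, offset, data):
    """Overwrite part of the preallocated file; its size never changes."""
    os.lseek(fd, offset, os.SEEK_SET)
    view = memoryview(data)
    while len(view) > 0:
        n = os.write(fd, view)  # os.write may write fewer bytes than asked
        view = view[n:]
```

With this layout the file always has the full segment size, so a reader never sees a short page at the end; whatever has not been received yet is simply zeroes.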

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 14:57:22
Message-ID: CABUevEziv=ejN1XmQM6EWuc_4q=30dUZu2=PJtkajgfxQVF8yQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 16:54, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> On Thu, Oct 27, 2011 at 13:19, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> On 27.10.2011 14:09, Fujii Masao wrote:
>>>> Yes. But that sounds unuserfriendly. Padding the WAL file manually
>>>> is easy-to-do for a user?
>
>> I'd definitely want to avoid anything that requires pg_receivexlog to
>> actually *parse* the WAL. That'll make it way more complex than I'd
>> like.
>
> What parsing?  Just pad to 16MB with zeroes.  In fact, I think the

I'm just saying that *if* parsing is required, it would be bad.

> receiver should just create the file that size to start with, and then
> write received data into it, much like normal WAL creation does.

So when pg_receivexlog starts up, how would it know if the last file
represents a completed file, or a half-full file, without actually
parsing it? It could be a 16MB file with 10 bytes of valid data, or a
complete file with 16MB of valid data.

We could always ask for a retransmit of the whole file, but if that
file is gone on the master, we won't be able to do that, and will
error out in a situation that's not actually an error.

Though I guess if we leave the file as .partial up until this point
(per my other patch just posted), I guess this doesn't actually apply
- if the file is called .partial, we'll overwrite into it. If it's
not, then we assume it's a complete segment.
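
The naming rule can be sketched in Python as below. This is only an illustration of the idea, not the patch: `resume_point` and its return values are made up here. The one fact relied on is that WAL segment file names are 24 hexadecimal digits, so they sort in stream order within a timeline.

```python
import os
import re

# A WAL segment name is 24 hex digits; ".partial" marks an unfinished one.
SEGMENT_RE = re.compile(r"([0-9A-F]{24})(\.partial)?")


def resume_point(directory):
    """Decide where to resume streaming, using file names only.

    Returns ("overwrite", segment) if a .partial file exists,
    ("after", segment) if only complete segments exist,
    or ("start", None) if the directory holds no segments.
    """
    complete, partial = [], []
    for name in sorted(os.listdir(directory)):
        m = SEGMENT_RE.fullmatch(name)
        if m is None:
            continue  # ignore unrelated files
        if m.group(2):
            partial.append(m.group(1))
        else:
            complete.append(m.group(1))
    if partial:
        return ("overwrite", partial[-1])
    if complete:
        return ("after", complete[-1])
    return ("start", None)
```

Nothing in the file contents is inspected, which is the point: the decision needs no WAL parsing at all.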

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-27 15:14:14
Message-ID: m2ehxyzcu1.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Thu, Oct 27, 2011 at 13:19, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Perhaps we should add automatic padding in the server, though. It wouldn't
>> take much code in the server, and would make life easier for people writing
>> their scripts. Maybe even have the backend check for a .partial file if it
>> can't find a regularly named XLOG file. Needs some thought..
>
> I'd definitely want to avoid anything that requires pg_receivexlog to
> actually *parse* the WAL. That'll make it way more complex than I'd
> like.

What about creating the WAL file filled up with zeroes at the receiving
end and then overwriting data as we receive it?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2011-10-28 06:56:46
Message-ID: CAHGQGwGUT=bPb2VXyy_VbCMMdr-jm-FJeeW4gQ8gj+swvgQCVQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 27, 2011 at 11:57 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Oct 27, 2011 at 16:54, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>>> On Thu, Oct 27, 2011 at 13:19, Heikki Linnakangas
>>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>>> On 27.10.2011 14:09, Fujii Masao wrote:
>>>>> Yes. But that sounds unuserfriendly. Padding the WAL file manually
>>>>> is easy-to-do for a user?
>>
>>> I'd definitely want to avoid anything that requires pg_receivexlog to
>>> actually *parse* the WAL. That'll make it way more complex than I'd
>>> like.
>>
>> What parsing?  Just pad to 16MB with zeroes.  In fact, I think the
>
> I'm just saying that *if* parsing is required, it would be bad.
>
>> receiver should just create the file that size to start with, and then
>> write received data into it, much like normal WAL creation does.
>
> So when pg_receivexlog starts up, how would it know if the last file
> represents a completed file, or a half-full file, without actually
> parsing it? It could be a 16MB file with 10 bytes of valid data, or a
> complete file with 16MB of valid data.
>
> We could always ask for a retransmit of the whole file, but if that
> file is gone on the master, we won't be able to do that, and will
> error out in a situation that's not actually an error.
>
> Though I guess if we leave the file as .partial up until this point
> (per my other patch just posted), I guess this doesn't actually apply
> - if the file is called .partial, we'll overwrite into it. If it's
> not, then we assume it's a complete segment.

Yeah, I think that we should commit the patch that you posted in
other thread, and should change pg_receivexlog so that it creates
new WAL file filled up with zero or opens a pre-existing one, like
XLogFileInit() does, before writing any streamed data. If we do
this, a user can easily use a partial WAL file for recovery by
renaming that file.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Ants Aasma <ants(at)cybertec(dot)at>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2012-06-04 14:25:07
Message-ID: CA+CSw_s4gAm=hQHANwDyABztfJ_qdS1mQDeHLtLv8AHtsosE=Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 29, 2011 at 11:30 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> it doesn't say that is not possible to use this for a standby
>> server... probably that's why i get the error i put a recovery.conf
>> after pg_basebackup finished... maybe we can say that  more loudly?
>
> The idea is, if you use it with -x (or --xlog), it's for taking a
> backup/clone, *not* for replication.
>
> If you use it without -x, then you can use it as the start of a
> replica, by adding a recovery.conf.
>
> But you can't do both at once, that will confuse it.

I stumbled upon this again today. There's nothing in the docs that
would even hint that using -x shouldn't work to create a replica. Why
does it get confused and can we (easily) make it not get confused? At
the very least it needs a big fat warning in documentation for the -x
option that the resulting backup might not be usable as a standby.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2012-06-04 15:20:14
Message-ID: CAHGQGwGTAtp0YPE3--V5XP+BoJEGWGSGfixDTPjrX_gmg7w2uw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 4, 2012 at 11:25 PM, Ants Aasma <ants(at)cybertec(dot)at> wrote:
> On Thu, Sep 29, 2011 at 11:30 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>> it doesn't say that is not possible to use this for a standby
>>> server... probably that's why i get the error i put a recovery.conf
>>> after pg_basebackup finished... maybe we can say that  more loudly?
>>
>> The idea is, if you use it with -x (or --xlog), it's for taking a
>> backup/clone, *not* for replication.
>>
>> If you use it without -x, then you can use it as the start of a
>> replica, by adding a recovery.conf.
>>
>> But you can't do both at once, that will confuse it.
>
> I stumbled upon this again today. There's nothing in the docs that
> would even hint that using -x shouldn't work to create a replica. Why
> does it get confused and can we (easily) make it not get confused? At
> the very least it needs a big fat warning in documentation for the -x
> option that the resulting backup might not be usable as a standby.

Unless I'm missing something, you can use pg_basebackup -x for the
standby. If lots of WAL files are generated in the master after
pg_basebackup -x ends and before you start the standby instance,
you may get the following error. In this case, you need to fall back on
archived WAL files even though you specified the -x option in pg_basebackup.

> FATAL: could not receive data from WAL stream: FATAL: requested WAL
> segment 00000001000000000000005C has already been removed

Though we have the above problem, pg_basebackup -x is usable for
the standby, I think.
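
For illustration, here is a rough sketch of a recovery.conf for such a standby that can also fall back on archived WAL. The parameter names standby_mode, primary_conninfo and restore_command are the real recovery.conf settings; the helper render_recovery_conf and the host, user and archive path in the usage example are made up.

```python
def render_recovery_conf(primary_conninfo, restore_command=None):
    """Return the text of a minimal standby recovery.conf.

    Hypothetical helper; the values passed in are placeholders.
    """
    def quote(value):
        # recovery.conf values are single-quoted; embedded quotes are doubled.
        return "'" + value.replace("'", "''") + "'"

    lines = [
        "standby_mode = 'on'",
        "primary_conninfo = " + quote(primary_conninfo),
    ]
    if restore_command is not None:
        # Lets the standby fetch segments from the archive when the
        # master has already removed them.
        lines.append("restore_command = " + quote(restore_command))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder connection string and archive location.
    print(render_recovery_conf(
        "host=master.example.com port=5432 user=replicator",
        "cp /mnt/archive/%f %p",
    ))
```

With restore_command set, a "requested WAL segment ... has already been removed" error from streaming should no longer be fatal as long as the archive still holds that segment.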

Regards,

--
Fujii Masao


From: Ants Aasma <ants(at)cybertec(dot)at>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2012-06-04 15:48:40
Message-ID: CA+CSw_vZNx0oV+44qHEqNsyZqFihRCuAj1hodqdQHuNDSywR2g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 4, 2012 at 6:20 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Mon, Jun 4, 2012 at 11:25 PM, Ants Aasma <ants(at)cybertec(dot)at> wrote:
>> On Thu, Sep 29, 2011 at 11:30 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>> it doesn't say that is not possible to use this for a standby
>>>> server... probably that's why i get the error i put a recovery.conf
>>>> after pg_basebackup finished... maybe we can say that  more loudly?
>>>
>>> The idea is, if you use it with -x (or --xlog), it's for taking a
>>> backup/clone, *not* for replication.
>>>
>>> If you use it without -x, then you can use it as the start of a
>>> replica, by adding a recovery.conf.
>>>
>>> But you can't do both at once, that will confuse it.
>>
>> I stumbled upon this again today. There's nothing in the docs that
>> would even hint that using -x shouldn't work to create a replica. Why
>> does it get confused and can we (easily) make it not get confused? At
>> the very least it needs a big fat warning in documentation for the -x
>> option that the resulting backup might not be usable as a standby.
>
> Unless I'm missing something, you can use pg_basebackup -x for the
> standby. If lots of WAL files are generated in the master after
> pg_basebackup -x ends and before you start the standby instance,
> you may get the following error. In this case, you need to consult with
> archived WAL files even though you specified -x option in pg_basebackup.
>
>> FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
>> segment 00000001000000000000005C has already been removed
>
> Though we have the above problem, pg_basebackup -x is usable for
> the standby, I think.

I assumed from Magnus's comment that this is a known problem. I wonder
what went wrong if it should have worked. In the case where this
turned up, the missing file was an xlog file with the new timeline ID
but one segment before the timeline switch. I'll have to see if I can
create a reproducible case for this.
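
As an aside for anyone trying to reproduce this: the timeline ID can be read straight from a segment file name, since the name is 24 hex digits made of three 8-digit fields (timeline, log ID, segment). A small sketch, where parse_wal_segment_name is a hypothetical helper and not part of any PostgreSQL tool:

```python
import re

# A WAL segment file name is exactly 24 upper-case hex digits.
_WAL_NAME = re.compile(r"[0-9A-F]{24}")


def parse_wal_segment_name(name):
    """Split a WAL segment file name into (timeline, log_id, segment).

    Hypothetical helper for illustration only.
    """
    if not _WAL_NAME.fullmatch(name):
        raise ValueError("not a WAL segment file name: %r" % name)
    timeline = int(name[0:8], 16)   # first 8 hex digits: timeline ID
    log_id = int(name[8:16], 16)    # middle 8 hex digits: log file ID
    segment = int(name[16:24], 16)  # last 8 hex digits: segment number
    return timeline, log_id, segment


if __name__ == "__main__":
    # Segment name taken from the error message quoted above.
    print(parse_wal_segment_name("00000001000000000000005C"))
```

Comparing the timeline field of the missing file with the files actually present in pg_xlog should show whether the standby asked for a pre-switch segment under the new timeline ID.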

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2012-06-04 15:53:02
Message-ID: CABUevEyk7LzciGae1Q=9qzDhscWr-11OpsactA3YnicRdybY+Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 4, 2012 at 5:48 PM, Ants Aasma <ants(at)cybertec(dot)at> wrote:
> On Mon, Jun 4, 2012 at 6:20 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Mon, Jun 4, 2012 at 11:25 PM, Ants Aasma <ants(at)cybertec(dot)at> wrote:
>>> On Thu, Sep 29, 2011 at 11:30 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>>>>> it doesn't say that is not possible to use this for a standby
>>>>> server... probably that's why i get the error i put a recovery.conf
>>>>> after pg_basebackup finished... maybe we can say that  more loudly?
>>>>
>>>> The idea is, if you use it with -x (or --xlog), it's for taking a
>>>> backup/clone, *not* for replication.
>>>>
>>>> If you use it without -x, then you can use it as the start of a
>>>> replica, by adding a recovery.conf.
>>>>
>>>> But you can't do both at once, that will confuse it.
>>>
>>> I stumbled upon this again today. There's nothing in the docs that
>>> would even hint that using -x shouldn't work to create a replica. Why
>>> does it get confused and can we (easily) make it not get confused? At
>>> the very least it needs a big fat warning in documentation for the -x
>>> option that the resulting backup might not be usable as a standby.
>>
>> Unless I'm missing something, you can use pg_basebackup -x for the
>> standby. If lots of WAL files are generated in the master after
>> pg_basebackup -x ends and before you start the standby instance,
>> you may get the following error. In this case, you need to consult with
>> archived WAL files even though you specified -x option in pg_basebackup.
>>
>>> FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
>>> segment 00000001000000000000005C has already been removed
>>
>> Though we have the above problem, pg_basebackup -x is usable for
>> the standby, I think.
>
> I assumed from Magnus's comment that this is a known problem. I wonder
> what went wrong if it should have worked. In the case where this
> turned up the missing file was an xlog file with the new timeline ID
> but one segment before the timeline switch. I'll have to see if I can
> create a reproducible case for this.

No, it's more a "there's no reason to do that". I don't think it
should necessarily be an actual problem.

In your case the missing piece of information is why was there a
timeline switch? pg_basebackup shouldn't cause a timeline switch
whether you use it in -x mode or not.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Ants Aasma <ants(at)cybertec(dot)at>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated version of pg_receivexlog
Date: 2012-06-04 16:06:48
Message-ID: CA+CSw_swtpx8=d-+ew3WZR4teX8nohgJ5QFuheJm3xT=4vVG7Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 4, 2012 at 6:53 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> No, it's more a "there's no reason to do that". I don't think it
> should necessarily be an actual problem.

Ok, good to know.

> In your case the missing piece of information is why was there a
> timeline switch? pg_basebackup shouldn't cause a timeline switch
> whether you use it in -x mode or not.

No mystery there. The timeline switch was because I had just promoted
the master for standby mode. There's a chance I might have
accidentally done something horribly wrong somewhere because I can't
immediately reproduce this. I'll let you know if I find out how I
managed to create this error.

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de