pgsql: Add --synchronous option to pg_receivexlog, for more reliable WA

From: Fujii Masao <fujii(at)postgresql(dot)org>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Add --synchronous option to pg_receivexlog, for more reliable WA
Date: 2014-11-17 17:34:10
Message-ID: E1XqQBm-0007AJ-SU@gemulon.postgresql.org
Lists: pgsql-committers pgsql-hackers

Add --synchronous option to pg_receivexlog, for more reliable WAL writing.

Previously pg_receivexlog flushed WAL data only when the WAL file was switched.
Then 3dad73e added the -F option to pg_receivexlog so that users could control
how frequently sync commands were issued to WAL files. It also allowed users
to make pg_receivexlog flush WAL data immediately after writing by
specifying 0 for the -F option. However, feedback messages were not sent back
immediately even after the flush location was updated. So even if WAL data
was flushed in real time, the server could not see that for a while.

This commit removes the -F option from pg_receivexlog and adds --synchronous.
If --synchronous is specified, pg_receivexlog, like the standby's WAL receiver,
flushes WAL data as soon as there is any that has not been flushed yet.
It then sends a feedback message reporting the latest flush location
to the server. This option is useful, for example, for making pg_receivexlog
behave as a synchronous standby by using a replication slot.
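
As a rough illustration of the setup described above, the following Python
sketch assembles a pg_receivexlog command line that uses both a replication
slot and --synchronous. The helper name, the directory and the slot name are
hypothetical and chosen only for this example, and the slot is assumed to
exist already on the server.

```python
import shlex


def receivexlog_argv(directory, slot, synchronous=True):
    """Build an argument list for pg_receivexlog.

    Illustrative helper only; it is not part of PostgreSQL.
    """
    argv = ["pg_receivexlog", "-D", directory, "--slot=" + slot]
    if synchronous:
        # Flush received WAL immediately and report the flush location
        # to the server right away.
        argv.append("--synchronous")
    return argv


# Hypothetical directory and slot name.
argv = receivexlog_argv("/path/to/archive", "receivexlog_slot")
print(" ".join(shlex.quote(a) for a in argv))
```

For the server to actually wait for this receiver, the name under which
pg_receivexlog connects also has to be listed in synchronous_standby_names
on the master.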

Original patch by Furuya Osamu, heavily rewritten by me.
Reviewed by Heikki Linnakangas, Alvaro Herrera and Sawada Masahiko.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/c4f99d20294950576d552dcaf9ce5b9bdc4233a3

Modified Files
--------------
doc/src/sgml/ref/pg_receivexlog.sgml | 42 ++++++++++++-----------
src/bin/pg_basebackup/pg_basebackup.c | 2 +-
src/bin/pg_basebackup/pg_receivexlog.c | 23 +++++--------
src/bin/pg_basebackup/receivelog.c | 57 ++++++++++++--------------------
src/bin/pg_basebackup/receivelog.h | 2 +-
5 files changed, 54 insertions(+), 72 deletions(-)


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Fujii Masao <fujii(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Add --synchronous option to pg_receivexlog, for more reliable WA
Date: 2015-09-16 02:25:09
Message-ID: 55F8D305.9040506@gmx.net
Lists: pgsql-committers pgsql-hackers

On 11/17/14 12:34 PM, Fujii Masao wrote:
> Add --synchronous option to pg_receivexlog, for more reliable WAL writing.

The last two sentences of this piece of documentation are a bit
hand-wavy and hard to parse. Could you clarify this?

<varlistentry>
<term><option>-S <replaceable>slotname</replaceable></option></term>
<term><option>--slot=<replaceable class="parameter">slotname</replaceable></option></term>
<listitem>
<para>
Require <application>pg_receivexlog</application> to use an existing
replication slot (see <xref linkend="streaming-replication-slots">).
When this option is used, <application>pg_receivexlog</> will report
a flush position to the server, indicating when each segment has been
synchronized to disk so that the server can remove that segment if it
is not otherwise needed. <literal>--synchronous</literal> option must
be specified when making <application>pg_receivexlog</> run as
synchronous standby by using replication slot. Otherwise WAL data
cannot be flushed frequently enough for this to work correctly.
</para>
</listitem>
</varlistentry>


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Fujii Masao <fujii(at)postgresql(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Add --synchronous option to pg_receivexlog, for more reliable WA
Date: 2015-09-16 03:04:43
Message-ID: CAHGQGwG2RBc8t=VGPSpZLDg67SPqZw5PwcP1jhPXXnHS8msVjw@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Sep 16, 2015 at 11:25 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 11/17/14 12:34 PM, Fujii Masao wrote:
>> Add --synchronous option to pg_receivexlog, for more reliable WAL writing.
>
> The last two sentences of this piece of documentation are a bit
> hand-wavy and hard to parse. Could you clarify this?

I think what those sentences are trying to say is this: to make pg_receivexlog
work as a synchronous standby as expected, both the --slot and --synchronous
options need to be specified.

If the --slot option is specified, pg_receivexlog reports a flush position to
the server even when --synchronous is not specified. So users might think
that the --synchronous option is not necessary for a synchronous pg_receivexlog
setup. But that's not true. If --synchronous is not specified, the
received WAL data is flushed to disk only when the WAL segment is switched.
So transactions on the master have to wait for a long time, i.e.,
synchronous pg_receivexlog doesn't work smoothly.
To avoid that situation, the --synchronous option also needs to be specified,
which makes pg_receivexlog flush WAL data immediately after receiving it.
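
The difference in flush timing can be sketched with a small toy model in
Python. This is only an illustration under simplifying assumptions, not
pg_receivexlog's actual code: the class and function names are made up for
this example, positions are plain byte counts rather than real WAL locations,
the default 16 MB segment size is assumed, and periodic status messages are
ignored.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size, in bytes


class ToyReceiver:
    """Toy model of when a receiver flushes WAL; not pg_receivexlog's real code."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.written = 0   # bytes received and written so far
        self.flushed = 0   # bytes known to be safely on disk

    def receive(self, nbytes):
        segments_before = self.written // SEGMENT_SIZE
        self.written += nbytes
        segments_after = self.written // SEGMENT_SIZE
        if self.synchronous:
            # With --synchronous: flush whatever has been written so far.
            self.flushed = self.written
        elif segments_after > segments_before:
            # Without it: flush only up to the last completed segment.
            self.flushed = segments_after * SEGMENT_SIZE
        return self.flushed  # position that would be reported to the server


def commit_acknowledged(receiver, commit_position):
    """A synchronous commit can return once the flush position has reached it."""
    return receiver.flushed >= commit_position


plain = ToyReceiver(synchronous=False)
sync = ToyReceiver(synchronous=True)
for receiver in (plain, sync):
    receiver.receive(8192)  # one small transaction's worth of WAL

print("without --synchronous:", commit_acknowledged(plain, 8192))
print("with --synchronous:   ", commit_acknowledged(sync, 8192))
```

In this model the first line prints False and the second True: without
--synchronous the small commit stays unacknowledged until enough WAL arrives
to complete the segment, which is the long wait described above.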

Regards,

--
Fujii Masao


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Add --synchronous option to pg_receivexlog, for more reliable WA
Date: 2015-11-19 19:39:41
Message-ID: 564E257D.5000307@gmx.net
Lists: pgsql-committers pgsql-hackers

On 9/15/15 11:04 PM, Fujii Masao wrote:
> If the --slot option is specified, pg_receivexlog reports a flush position to
> the server even when --synchronous is not specified. So users might think
> that the --synchronous option is not necessary for a synchronous pg_receivexlog
> setup. But that's not true. If --synchronous is not specified, the
> received WAL data is flushed to disk only when the WAL segment is switched.
> So transactions on the master have to wait for a long time, i.e.,
> synchronous pg_receivexlog doesn't work smoothly.
> To avoid that situation, the --synchronous option also needs to be specified,
> which makes pg_receivexlog flush WAL data immediately after receiving it.

Thank you for this information. I hope to have clarified this in the
documentation now.