Re: Patch for fail-back without fresh backup

From: Samrat Revagade <revagade(dot)samrat(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Patch for fail-back without fresh backup
Date: 2013-09-12 07:00:49
Message-ID: CAF8Q-GyrB94610MtoT92aJQfrT39soOVwd-ZiStfXLFVvwC_2w@mail.gmail.com
Lists: pgsql-hackers

> We are improving the patch for Commit Fest 2 now.
> We will fix above compiler warnings as soon as possible and submit the
> patch

The attached *synchronous_transfer_v5.patch* addresses the review comments
from commit fest 1 and reduces the performance overhead of
synchronous_transfer.

*synchronous_transfer_documentation_v1.patch* documents the fail-back safe
standby mechanism in the PostgreSQL documentation.

Sawada-san worked very hard to get this done; most of the work was done by
him. I really appreciate his efforts :)

**Brief description of the suggestions from commit fest 1 and their status:

1) "Fail-back safe standby" is not an appropriate name [done - changed it to
synchronous_transfer].

2) Remove the extra set of postgresql.conf parameters [done - there is now
only one additional postgresql.conf parameter, *synchronous_transfer*, which
controls the synchronous nature of WAL transfer].

3) Measure the performance overhead [done - with fast transaction workloads,
and with large loads and index builds].

4) Put SyncRepWaitForLSN inside XLogFlush [not the correct way -
SyncRepWaitForLSN would then run inside a critical section, and because it
does network I/O and may also sleep, any error it raises there leads to a
server PANIC and restart].

5) Old master's WAL is ahead of the new master's WAL [we overcome this by
deleting all WAL files of the old master; details can be found here:
https://wiki.postgresql.org/wiki/Synchronous_Transfer ]

**Changes to postgresql.conf to configure a fail-back safe standby:

1) Synchronous fail-back safe standby

synchronous_standby_names = <server name>

synchronous_transfer = all

2) Asynchronous fail-back safe standby

synchronous_standby_names = <server name>

synchronous_transfer = data_flush

3) Pure synchronous standby

synchronous_standby_names = <server name>

synchronous_transfer = commit

4) Pure asynchronous standby

synchronous_transfer = commit
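
The four configurations above differ only in whether
synchronous_standby_names is set and in the value of synchronous_transfer.
As a rough sketch of that mapping (the function name and the returned labels
are hypothetical, chosen for illustration; this is not code from the patch):

```python
from typing import Optional

# Values of synchronous_transfer as listed in this mail.
VALID_TRANSFER = {"all", "data_flush", "commit"}


def standby_mode(synchronous_standby_names: Optional[str],
                 synchronous_transfer: str) -> str:
    """Return the standby mode implied by the two settings (illustrative only)."""
    if synchronous_transfer not in VALID_TRANSFER:
        raise ValueError(f"unknown synchronous_transfer: {synchronous_transfer!r}")

    has_sync_standby = bool(synchronous_standby_names)

    if synchronous_transfer == "commit":
        # Cases 3 and 4 differ only in synchronous_standby_names.
        return "pure synchronous" if has_sync_standby else "pure asynchronous"

    if not has_sync_standby:
        # This combination is not described in the mail, so do not guess.
        raise ValueError("combination not described in this mail")

    if synchronous_transfer == "all":
        return "synchronous fail-back safe"
    return "asynchronous fail-back safe"  # synchronous_transfer == "data_flush"
```

Note that cases 3 and 4 use the same synchronous_transfer value, so the
presence of synchronous_standby_names is what separates them.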

**Restriction:

If multiple standby servers connect to the master, the standby that uses
synchronous replication becomes the fail-back safe standby.

For example, if two standby servers connect to the master (one SYNC, the
other ASYNC) and synchronous_transfer is set to 'all', then the SYNC standby
becomes the fail-back safe standby and the master waits only for the SYNC
standby.
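
The restriction can be pictured with a small sketch. The function and the
standby names are hypothetical, and the 'sync' / 'async' labels are assumed
to mirror the sync_state values reported in pg_stat_replication; this is an
illustration of the rule, not the patch's logic:

```python
from typing import Optional, Sequence, Tuple


def failback_safe_standby(standbys: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the name of the standby the master waits for, if any.

    Each entry is (name, sync_state). Only a standby in 'sync' state is
    treated as the fail-back safe standby; async standbys are ignored.
    """
    for name, sync_state in standbys:
        if sync_state == "sync":
            return name
    return None


if __name__ == "__main__":
    connected = [("standby_a", "async"), ("standby_b", "sync")]
    print(failback_safe_standby(connected))
```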

**Performance overhead of synchronous_transfer patch:

Tests were performed with the pgbench benchmark using the following
configuration options:

Transaction type: TPC-B (sort of)

Scaling factor: 300

Query mode: simple

Number of clients: 150

Number of threads: 1

Duration: 1800 s
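
The exact command lines are not given in this mail. As a sketch, the options
listed above correspond to pgbench's -s, -M, -c, -j and -T switches, and the
commands could be assembled as follows (the database name "bench" is a
placeholder, not taken from the tests):

```python
from typing import List

DBNAME = "bench"  # hypothetical database name, not given in the mail


def pgbench_init_cmd(scale: int = 300) -> List[str]:
    """Command that initializes the pgbench tables at the given scaling factor."""
    return ["pgbench", "-i", "-s", str(scale), DBNAME]


def pgbench_run_cmd(clients: int = 150, threads: int = 1,
                    duration_s: int = 1800,
                    query_mode: str = "simple") -> List[str]:
    """Command that runs the default TPC-B (sort of) script."""
    return ["pgbench", "-M", query_mode, "-c", str(clients),
            "-j", str(threads), "-T", str(duration_s), DBNAME]


if __name__ == "__main__":
    print(" ".join(pgbench_init_cmd()))
    print(" ".join(pgbench_run_cmd()))
```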

Real-world scenarios are mostly based on fast transaction workloads, for
which synchronous_transfer has negligible overhead.

**1. Test for fast transaction workloads [measured w.r.t. default
replication in PostgreSQL, pgbench benchmark - TPS value]:

a. Average performance overhead caused by synchronous standby: 0.0102%.

b. Average performance overhead caused by synchronous fail-back safe
standby: 0.2943%.

c. Average performance overhead caused by asynchronous standby: 0.04321%.

d. Average performance overhead caused by asynchronous fail-back safe
standby: 0.5141%.

**2. Test for large loads and index builds [measured w.r.t. default
replication in PostgreSQL, pgbench benchmark (-i option) - time in seconds]:

a. Average performance overhead caused by synchronous standby: 3.51%.

b. Average performance overhead caused by synchronous fail-back safe
standby: 14.88%.

c. Average performance overhead caused by asynchronous standby: 0.4887%.

d. Average performance overhead caused by asynchronous fail-back safe
standby: 10.19%.
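
The mail does not spell out how these percentages were derived. One
plausible reading, stated here as an assumption, is the relative change
against the default-replication baseline: a drop in TPS for test 1 and an
increase in elapsed time for test 2. A minimal sketch with hypothetical
helper names:

```python
def tps_overhead_pct(baseline_tps: float, measured_tps: float) -> float:
    """Overhead as the percentage of throughput lost versus the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - measured_tps) / baseline_tps * 100.0


def time_overhead_pct(baseline_s: float, measured_s: float) -> float:
    """Overhead as the percentage of extra elapsed time versus the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (measured_s - baseline_s) / baseline_s * 100.0


if __name__ == "__main__":
    # Made-up inputs for illustration only; not measurements from these tests.
    print(tps_overhead_pct(1000.0, 995.0))
    print(time_overhead_pct(200.0, 230.0))
```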

**TO-DO:

More discussion is needed on the usefulness and priority of the following.
Any feedback is appreciated:

1. Support for multiple fail-back safe standbys.

2. The current design of the patch waits forever for the fail-back safe
standby, as streaming replication does.

3. Support for cascaded fail-back standbys.

---
Regards,

Samrat Revagade

Attachment                                    Content-Type               Size
synchronous_transfer_documentation_v1.patch   application/octet-stream   10.8 KB
synchronous_transfer_v5.patch                 application/octet-stream   20.8 KB
