Re: Clean switchover

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clean switchover
Date: 2013-06-12 04:41:59
Message-ID: 005601ce6727$2d5df1c0$8819d540$@kapila@huawei.com
Lists: pgsql-hackers

On Wednesday, June 12, 2013 4:23 AM Fujii Masao wrote:
> Hi,
>
> In streaming replication, when we shutdown the master, walsender tries
> to send all the outstanding WAL records including the shutdown
> checkpoint record to the standby, and then to exit. This basically
> means that all the WAL records are fully synced between two servers
> after the clean shutdown of the master. So, after promoting the standby
> to new master, we can restart the stopped master as new standby without
> the need for a fresh backup from new master.
>
> But there is one problem: though walsender tries to send all the
> outstanding WAL records, it doesn't wait for them to be replicated to
> the standby. IOW, walsender closes the replication connection as soon
> as it sends WAL records.
> Then, before receiving all the WAL records, walreceiver can detect the
> closure of connection and exit. We cannot guarantee that there is no
> missing WAL in the standby after clean shutdown of the master. In this
> case, backup from new master is required when restarting the stopped
> master as new standby. I have experienced this case several times,
> especially when enabling WAL archiving.
>
> The attached patch fixes this problem. It just changes walsender so
> that it waits for all the outstanding WAL records to be replicated to
> the standby before closing the replication connection.
>
> You may be concerned the case where the standby gets stuck and the
> walsender keeps waiting for the reply from that standby. In this case,
> wal_sender_timeout detects such inactive standby and then walsender
> ends. So even in that case, the shutdown can end.

Do you think this can increase the time taken to complete shutdown?
After the shutdown completes, the user will promote the standby to master,
so any delay in shutdown also delays the switchover.
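
To make the concern concrete, here is a rough toy model of the wait being
discussed. It is only a sketch, not the patch's code: the function name
shutdown_wait, the (time, flushed LSN) reply list, and plain integer LSNs
are all assumptions made for illustration. It assumes the walsender gives
up once no reply has arrived for timeout_ms, which is how I read the
wal_sender_timeout behaviour described above.

```python
def shutdown_wait(sent_lsn, flush_reports, timeout_ms=60000):
    """Toy model of a walsender waiting for the standby at shutdown.

    sent_lsn      -- last WAL position sent (hypothetical integer LSN)
    flush_reports -- list of (elapsed_ms, flushed_lsn) standby replies,
                     in time order (hypothetical input format)
    timeout_ms    -- stand-in for wal_sender_timeout

    Returns (reason, elapsed_ms) where reason is "caught_up" or "timeout".
    """
    last_reply_ms = 0
    for at_ms, flushed_lsn in flush_reports:
        # No reply for longer than the timeout: the sender has already given up.
        if at_ms - last_reply_ms > timeout_ms:
            return ("timeout", last_reply_ms + timeout_ms)
        last_reply_ms = at_ms
        # The standby has flushed everything that was sent: safe to exit.
        if flushed_lsn >= sent_lsn:
            return ("caught_up", at_ms)
    # Replies stopped before the standby caught up.
    return ("timeout", last_reply_ms + timeout_ms)


if __name__ == "__main__":
    # Healthy standby: shutdown is delayed only until the final flush reply.
    print(shutdown_wait(100, [(10, 50), (20, 100)]))
    # Stuck standby: shutdown is delayed by the full timeout.
    print(shutdown_wait(100, []))
```

In this model the extra shutdown time is small when the standby is keeping
up, and is bounded by roughly one timeout period after the last reply when
it is not. Whether the real delay behaves this way depends on the patch.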

With Regards,
Amit Kapila.
