Quick Links

Re: Clean switchover

Lists:	pgsql-hackers

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Clean switchover
Date:	2013-06-11 22:53:29
Message-ID:	CAHGQGwHLjEROTMtSWJd=xg_VFwRe3oJWnTYsyBDUbRYa6rr0DQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

In streaming replication, when we shutdown the master, walsender tries to
send all the outstanding WAL records including the shutdown checkpoint
record to the standby, and then to exit. This basically means that all the
WAL records are fully synced between two servers after the clean shutdown
of the master. So, after promoting the standby to new master, we can
restart the stopped master as new standby without the need for a fresh
backup from new master.

But there is one problem: though walsender tries to send all the outstanding
WAL records, it doesn't wait for them to be replicated to the standby. IOW,
walsender closes the replication connection as soon as it sends WAL records.
Then, before receiving all the WAL records, walreceiver can detect
the closure of connection and exit. We cannot guarantee that there is no
missing WAL in the standby after clean shutdown of the master. In this case,
backup from new master is required when restarting the stopped master as
new standby. I have experienced this case several times, especially when
enabling WAL archiving.

The attached patch fixes this problem. It just changes walsender so that it
waits for all the outstanding WAL records to be replicated to the standby
before closing the replication connection.

You may be concerned the case where the standby gets stuck and the
walsender keeps waiting for the reply from that standby. In this case,
wal_sender_timeout detects such inactive standby and then walsender
ends. So even in that case, the shutdown can end.

Thought?

Regards,

--
Fujii Masao

Attachment	Content-Type	Size
switchover_v1.patch	application/octet-stream	2.8 KB

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 01:15:20
Message-ID:	20130612011520.GW7200@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Fujii Masao (masao(dot)fujii(at)gmail(dot)com) wrote:
> The attached patch fixes this problem. It just changes walsender so that it
> waits for all the outstanding WAL records to be replicated to the standby
> before closing the replication connection.

Seems like a good idea to me.. Rather surprised that we're not doing
this already, to be honest.

Thanks,

Stephen

From:	Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 03:55:21
Message-ID:	51B7F129.5080700@catalyst.net.nz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 12/06/13 13:15, Stephen Frost wrote:
> * Fujii Masao (masao(dot)fujii(at)gmail(dot)com) wrote:
>> The attached patch fixes this problem. It just changes walsender so that it
>> waits for all the outstanding WAL records to be replicated to the standby
>> before closing the replication connection.
>
> Seems like a good idea to me.. Rather surprised that we're not doing
> this already, to be honest.
>

Yeah +1 from here too. This would make clean switchovers for (typically)
testing scenarios a lot less complex and resource intensive (rebuilding
of the old master as a slave when you know it is ok is despairing on a
huge db).

On the related note (but not actually to do with this patch),
clarifying/expanding the docs about the various methods for standby
promotion:

1/ trigger file creation
2/ pg_ctl promote
3/ renaming/removing recovery.conf

and the differences between them would be great. For instance I only
recently realized that method 3) means the promoted standby does not
start a new timeline (incidentally - could this be an option to pg_ctl
promote) which is very useful for (again) controlled/clean switchovers.

regards

Mark

From:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To:	"'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 04:41:59
Message-ID:	005601ce6727$2d5df1c0$8819d540$@kapila@huawei.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wednesday, June 12, 2013 4:23 AM Fujii Masao wrote:
> Hi,
>
> In streaming replication, when we shutdown the master, walsender tries
> to send all the outstanding WAL records including the shutdown
> checkpoint record to the standby, and then to exit. This basically
> means that all the WAL records are fully synced between two servers
> after the clean shutdown of the master. So, after promoting the standby
> to new master, we can restart the stopped master as new standby without
> the need for a fresh backup from new master.
>
> But there is one problem: though walsender tries to send all the
> outstanding WAL records, it doesn't wait for them to be replicated to
> the standby. IOW, walsender closes the replication connection as soon
> as it sends WAL records.
> Then, before receiving all the WAL records, walreceiver can detect the
> closure of connection and exit. We cannot guarantee that there is no
> missing WAL in the standby after clean shutdown of the master. In this
> case, backup from new master is required when restarting the stopped
> master as new standby. I have experienced this case several times,
> especially when enabling WAL archiving.
>
> The attached patch fixes this problem. It just changes walsender so
> that it waits for all the outstanding WAL records to be replicated to
> the standby before closing the replication connection.
>
> You may be concerned the case where the standby gets stuck and the
> walsender keeps waiting for the reply from that standby. In this case,
> wal_sender_timeout detects such inactive standby and then walsender
> ends. So even in that case, the shutdown can end.

Do you think it can impact time to complete shutdown?
After completing shutdown, user will promote standby to master, so if there
is delay in shutdown, it can cause delay in switchover.

With Regards,
Amit Kapila.

From:	Magnus Hagander <magnus(at)hagander(dot)net>
To:	Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 11:38:37
Message-ID:	CABUevEyT6xXtumW9mw9Gr7bu43K1+LLZK-jhi=YsWMGYRSLkaA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jun 12, 2013 at 6:41 AM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Wednesday, June 12, 2013 4:23 AM Fujii Masao wrote:
>> Hi,
>>
>> In streaming replication, when we shutdown the master, walsender tries
>> to send all the outstanding WAL records including the shutdown
>> checkpoint record to the standby, and then to exit. This basically
>> means that all the WAL records are fully synced between two servers
>> after the clean shutdown of the master. So, after promoting the standby
>> to new master, we can restart the stopped master as new standby without
>> the need for a fresh backup from new master.
>>
>> But there is one problem: though walsender tries to send all the
>> outstanding WAL records, it doesn't wait for them to be replicated to
>> the standby. IOW, walsender closes the replication connection as soon
>> as it sends WAL records.
>> Then, before receiving all the WAL records, walreceiver can detect the
>> closure of connection and exit. We cannot guarantee that there is no
>> missing WAL in the standby after clean shutdown of the master. In this
>> case, backup from new master is required when restarting the stopped
>> master as new standby. I have experienced this case several times,
>> especially when enabling WAL archiving.
>>
>> The attached patch fixes this problem. It just changes walsender so
>> that it waits for all the outstanding WAL records to be replicated to
>> the standby before closing the replication connection.
>>
>> You may be concerned the case where the standby gets stuck and the
>> walsender keeps waiting for the reply from that standby. In this case,
>> wal_sender_timeout detects such inactive standby and then walsender
>> ends. So even in that case, the shutdown can end.
>
> Do you think it can impact time to complete shutdown?
> After completing shutdown, user will promote standby to master, so if there
> is delay in shutdown, it can cause delay in switchover.

I'd expect a controlled switchover to happen without dataloss. Yes,
this could make it take a bit longer time, but it guarantees you don't
loose data. ISTM that if you don't care about the potential dataloss,
you can just use a faster shutdown method (e.g. immediate)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 11:48:41
Message-ID:	20130612114841.GA6007@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> The attached patch fixes this problem. It just changes walsender so that it
> waits for all the outstanding WAL records to be replicated to the standby
> before closing the replication connection.

Imo this is a fix that needs to get backpatched... The code tried to do
this but failed, I don't think it really gives grounds for valid *new*
concerns.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Magnus Hagander <magnus(at)hagander(dot)net>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 11:49:12
Message-ID:	CABUevExpY7JQJj+sKZoArPQNBNzrbU0f-57MpYVg6S9d7q=NDw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
>> The attached patch fixes this problem. It just changes walsender so that it
>> waits for all the outstanding WAL records to be replicated to the standby
>> before closing the replication connection.
>
> Imo this is a fix that needs to get backpatched... The code tried to do
> this but failed, I don't think it really gives grounds for valid *new*
> concerns.

+1 (without having looked at the code itself, it's definitely a
behaviour that needs to be fixed)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Magnus Hagander <magnus(at)hagander(dot)net>
Cc:	Andres Freund <andres(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 12:48:39
Message-ID:	20130612124839.GY7200@tamriel.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

* Magnus Hagander (magnus(at)hagander(dot)net) wrote:
> On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> >> The attached patch fixes this problem. It just changes walsender so that it
> >> waits for all the outstanding WAL records to be replicated to the standby
> >> before closing the replication connection.
> >
> > Imo this is a fix that needs to get backpatched... The code tried to do
> > this but failed, I don't think it really gives grounds for valid *new*
> > concerns.
>
> +1 (without having looked at the code itself, it's definitely a
> behaviour that needs to be fixed)

Yea, I was also thinking it would be reasonable to backpatch this; it
really looks like a bug that we're allowing this to happen today.

So, +1 on a backpatch for me. I've looked at the patch (it's a
one-liner, plus some additional comments) but havn't looked through the
overall code surrounding it.

Thanks,

Stephen

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Magnus Hagander <magnus(at)hagander(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-12 12:55:22
Message-ID:	20130612125522.GA19792@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-06-12 08:48:39 -0400, Stephen Frost wrote:
> * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
> > On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> > >> The attached patch fixes this problem. It just changes walsender so that it
> > >> waits for all the outstanding WAL records to be replicated to the standby
> > >> before closing the replication connection.
> > >
> > > Imo this is a fix that needs to get backpatched... The code tried to do
> > > this but failed, I don't think it really gives grounds for valid *new*
> > > concerns.
> >
> > +1 (without having looked at the code itself, it's definitely a
> > behaviour that needs to be fixed)
>
> Yea, I was also thinking it would be reasonable to backpatch this; it
> really looks like a bug that we're allowing this to happen today.
>
> So, +1 on a backpatch for me. I've looked at the patch (it's a
> one-liner, plus some additional comments) but havn't looked through the
> overall code surrounding it.

I've read most of the surrounding code and I think the patch is as
sensible as it can be without reworking the whole walsender main loop
which seems like a job for another day.

I'd personally write
if (caughtup && !pq_is_send_pending() &&
sentPtr == MyWalSnd->flush)
as
if (caughtup && sentPtr == MyWalSnd->flush &&
!pq_is_send_pending())

Since pq_is_send_pending() basically can only be false if the flush
comparison is true. There's the tiny chance that we were sending a
message out just before which is why we should include the
!pq_is_send_pending() condition at all in that if().

But that's fairly, fairly minor.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-13 19:38:57
Message-ID:	CAHGQGwHwjgUT+rTOQPX-0OVdBqW5PjW1Ps2v4tNRBJCsKk04Fw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jun 12, 2013 at 12:55 PM, Mark Kirkwood
<mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
> On 12/06/13 13:15, Stephen Frost wrote:
>>
>> * Fujii Masao (masao(dot)fujii(at)gmail(dot)com) wrote:
>>>
>>> The attached patch fixes this problem. It just changes walsender so that
>>> it
>>> waits for all the outstanding WAL records to be replicated to the standby
>>> before closing the replication connection.
>>
>>
>> Seems like a good idea to me.. Rather surprised that we're not doing
>> this already, to be honest.
>>
>
> Yeah +1 from here too. This would make clean switchovers for (typically)
> testing scenarios a lot less complex and resource intensive (rebuilding of
> the old master as a slave when you know it is ok is despairing on a huge
> db).
>
> On the related note (but not actually to do with this patch),
> clarifying/expanding the docs about the various methods for standby
> promotion:
>
> 1/ trigger file creation
> 2/ pg_ctl promote
> 3/ renaming/removing recovery.conf
>
> and the differences between them would be great. For instance I only
> recently realized that method 3) means the promoted standby does not start a
> new timeline (incidentally - could this be an option to pg_ctl promote)
> which is very useful for (again) controlled/clean switchovers.

In 9.3, you no longer need to worry about the increment of timeline
after the promotion because the standby can automatically follow
the timeline switch.

Regards,

--
Fujii Masao

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-13 19:43:57
Message-ID:	CAHGQGwE=zoGTLTxxRp5e06K3zgi-dMFCns31xDyRsEbnEhHT8A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jun 12, 2013 at 9:55 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-06-12 08:48:39 -0400, Stephen Frost wrote:
>> * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
>> > On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
>> > >> The attached patch fixes this problem. It just changes walsender so that it
>> > >> waits for all the outstanding WAL records to be replicated to the standby
>> > >> before closing the replication connection.
>> > >
>> > > Imo this is a fix that needs to get backpatched... The code tried to do
>> > > this but failed, I don't think it really gives grounds for valid *new*
>> > > concerns.
>> >
>> > +1 (without having looked at the code itself, it's definitely a
>> > behaviour that needs to be fixed)
>>
>> Yea, I was also thinking it would be reasonable to backpatch this; it
>> really looks like a bug that we're allowing this to happen today.
>>
>> So, +1 on a backpatch for me. I've looked at the patch (it's a
>> one-liner, plus some additional comments) but havn't looked through the
>> overall code surrounding it.
>
> I've read most of the surrounding code and I think the patch is as
> sensible as it can be without reworking the whole walsender main loop
> which seems like a job for another day.
>
> I'd personally write
> if (caughtup && !pq_is_send_pending() &&
> sentPtr == MyWalSnd->flush)
> as
> if (caughtup && sentPtr == MyWalSnd->flush &&
> !pq_is_send_pending())
>
> Since pq_is_send_pending() basically can only be false if the flush
> comparison is true. There's the tiny chance that we were sending a
> message out just before which is why we should include the
> !pq_is_send_pending() condition at all in that if().

Yep, I updated the patch that way. Thanks for the comment!

Regards,

--
Fujii Masao

Attachment	Content-Type	Size
switchover_v2.patch	application/octet-stream	2.8 KB

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Magnus Hagander <magnus(at)hagander(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-13 19:56:15
Message-ID:	CAHGQGwGoH639gckduNGzGtcEU-JMXy6gAPpa+bRBUGsC3=n8bg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jun 12, 2013 at 9:48 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
>> On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
>> >> The attached patch fixes this problem. It just changes walsender so that it
>> >> waits for all the outstanding WAL records to be replicated to the standby
>> >> before closing the replication connection.
>> >
>> > Imo this is a fix that needs to get backpatched... The code tried to do
>> > this but failed, I don't think it really gives grounds for valid *new*
>> > concerns.
>>
>> +1 (without having looked at the code itself, it's definitely a
>> behaviour that needs to be fixed)
>
> Yea, I was also thinking it would be reasonable to backpatch this; it
> really looks like a bug that we're allowing this to happen today.
>
> So, +1 on a backpatch for me.

+1. I think that we can backpatch to 9.1, 9.2 and 9.3.

In 9.0, the standby doesn't send back any message to the master and
there is no way to know whether replication has been done up to
the specified location, so I don't think that we can backpatch.

One note is, even if we backpatch, controlled switchover may require
the backup in order to follow the timeline switch, in 9.1 and 9.2.
If we want to avoid the backup in that case, we need to set up
the shared archive area between the master and the standby and
set recovery_target_timeline to 'latest'.

Regards,

--
Fujii Masao

From:	Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-14 01:50:10
Message-ID:	51BA76D2.4030906@catalyst.net.nz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 14/06/13 07:38, Fujii Masao wrote:
> On Wed, Jun 12, 2013 at 12:55 PM, Mark Kirkwood
> <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
>> On 12/06/13 13:15, Stephen Frost wrote:
>>>
>>> * Fujii Masao (masao(dot)fujii(at)gmail(dot)com) wrote:
>>>>
>>>> The attached patch fixes this problem. It just changes walsender so that
>>>> it
>>>> waits for all the outstanding WAL records to be replicated to the standby
>>>> before closing the replication connection.
>>>
>>>
>>> Seems like a good idea to me.. Rather surprised that we're not doing
>>> this already, to be honest.
>>>
>>
>> Yeah +1 from here too. This would make clean switchovers for (typically)
>> testing scenarios a lot less complex and resource intensive (rebuilding of
>> the old master as a slave when you know it is ok is despairing on a huge
>> db).
>>
>> On the related note (but not actually to do with this patch),
>> clarifying/expanding the docs about the various methods for standby
>> promotion:
>>
>> 1/ trigger file creation
>> 2/ pg_ctl promote
>> 3/ renaming/removing recovery.conf
>>
>> and the differences between them would be great. For instance I only
>> recently realized that method 3) means the promoted standby does not start a
>> new timeline (incidentally - could this be an option to pg_ctl promote)
>> which is very useful for (again) controlled/clean switchovers.
>
> In 9.3, you no longer need to worry about the increment of timeline
> after the promotion because the standby can automatically follow
> the timeline switch.
>
> Regards,
>

Yes - and that will be awesome. Nice work!

However for those systems still on 9.1/9.2for the time being, some
clarification of the details/differences uisg promotion via the 1) - 3)
would be useful I think.

Note I'm not demanding that you do it - I just happened to be thinking
about this when I saw your patch, so though I'd mention it (I've seen
some brief discussion about this before, but nothing concrete came out
of that in terms of documentation additions).

If folks are keen, I'm willing to attempt to find a suitable place in
the docs to add a discussion about 1) - 3) above. However I was kinda
hoping that someone who knows all the details would... fro instance I've
skimmed the code and note that the system "knows" when it is promoted
via trigger file vs pg_ctl... but as to what if any differences that
makes to the resulting actions (other than removing the trigger file) -
I am not clear on.

Cheers

Mark

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-24 06:41:29
Message-ID:	20130624064129.GE1254@alap2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-06-14 04:56:15 +0900, Fujii Masao wrote:
> On Wed, Jun 12, 2013 at 9:48 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
> >> On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> >> >> The attached patch fixes this problem. It just changes walsender so that it
> >> >> waits for all the outstanding WAL records to be replicated to the standby
> >> >> before closing the replication connection.
> >> >
> >> > Imo this is a fix that needs to get backpatched... The code tried to do
> >> > this but failed, I don't think it really gives grounds for valid *new*
> >> > concerns.
> >>
> >> +1 (without having looked at the code itself, it's definitely a
> >> behaviour that needs to be fixed)
> >
> > Yea, I was also thinking it would be reasonable to backpatch this; it
> > really looks like a bug that we're allowing this to happen today.
> >
> > So, +1 on a backpatch for me.
>
> +1. I think that we can backpatch to 9.1, 9.2 and 9.3.

I marked the patch as ready for committer.

> In 9.0, the standby doesn't send back any message to the master and
> there is no way to know whether replication has been done up to
> the specified location, so I don't think that we can backpatch.

Agreed. 9.0 doesn't have enough infrastructure for this.

> One note is, even if we backpatch, controlled switchover may require
> the backup in order to follow the timeline switch, in 9.1 and 9.2.
> If we want to avoid the backup in that case, we need to set up
> the shared archive area between the master and the standby and
> set recovery_target_timeline to 'latest'.

Fixing this seems outside the scope of this patch... - and rather
unlikely to be backpatchable.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Clean switchover
Date:	2013-06-25 17:23:10
Message-ID:	CAHGQGwGKhYLTk3ELis3Eooq1_ZRvnEbxhsgQ-k6OmHf+sGjiBw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Jun 24, 2013 at 3:41 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-06-14 04:56:15 +0900, Fujii Masao wrote:
>> On Wed, Jun 12, 2013 at 9:48 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > * Magnus Hagander (magnus(at)hagander(dot)net) wrote:
>> >> On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> >> > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
>> >> >> The attached patch fixes this problem. It just changes walsender so that it
>> >> >> waits for all the outstanding WAL records to be replicated to the standby
>> >> >> before closing the replication connection.
>> >> >
>> >> > Imo this is a fix that needs to get backpatched... The code tried to do
>> >> > this but failed, I don't think it really gives grounds for valid *new*
>> >> > concerns.
>> >>
>> >> +1 (without having looked at the code itself, it's definitely a
>> >> behaviour that needs to be fixed)
>> >
>> > Yea, I was also thinking it would be reasonable to backpatch this; it
>> > really looks like a bug that we're allowing this to happen today.
>> >
>> > So, +1 on a backpatch for me.
>>
>> +1. I think that we can backpatch to 9.1, 9.2 and 9.3.
>
> I marked the patch as ready for committer.

Committed. Thanks a lot!

Regards,

--
Fujii Masao