Re: Behaviour of take over the synchronous replication

From: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Behaviour of take over the synchronous replication
Date: 2013-08-30 06:03:53
Message-ID: CAD21AoBr+NBo2g++wd5x8Q1HLOMSHXOQB2c4ZttzZmm5T2Xj1A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 28, 2013 at 10:59 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Aug 27, 2013 at 4:51 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Sun, Aug 25, 2013 at 3:21 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On Sat, Aug 24, 2013 at 2:46 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>> On Sat, Aug 24, 2013 at 3:14 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>>>> On 08/23/2013 12:42 AM, Sawada Masahiko wrote:
>>>>>> In case (a), the priorities are clear, so I think that re-taking over
>>>>>> is the correct behaviour.
>>>>>> OTOH, in case (b), even though AAA and BBB have the same priority, the AAA
>>>>>> server steals SYNC replication.
>>>>>> I think it is better that the BBB server continues to act as the SYNC standby,
>>>>>> and AAA becomes a potential server.
>>>>>
>>>>> So, you're saying that:
>>>>>
>>>>> 1) synchronous_standby_names = '*'
>>>>>
>>>>> 2) replica 'BBB' is the current sync standby
>>>>>
>>>>> 3) replica 'AAA' comes online
>>>>>
>>>>> 4) replica 'AAA' grabs sync status
>>>>>
>>>>> ?
>>>> I'm sorry for the confusion.
>>>> I mean the following:
>>>>
>>>> 1) synchronous_standby_names = '*'
>>>>
>>>> 2) replica 'AAA' is the current sync standby
>>>>
>>>> 3) replica 'BBB' is the current async standby (potential sync standby)
>>>>
>>>> 4) replica 'AAA' fails. After that, replica 'BBB' is the current sync standby.
>>>>
>>>> 5) replica 'AAA' comes online
>>>>
>>>> 6) replica 'AAA' grabs sync status
>>>>
>>>>>
>>>>
>>>>
>>>>> If that's the case, I'm not really sure that's undesirable behavior.
>>>>> One could argue fairly persuasively that if you care about the
>>>>> precedence order of sync replicas, you shouldn't use '*'. And the rule
>>>>> of "if using *, the lowest-sorted replica name has sync" is actually a
>>>>> predictable, easy-to-understand rule.
>>>>>
>>>>> So if you want to make this a feature request, you'll need to come up
>>>>> with an argument as to why the current behavior is bad. Otherwise,
>>>>> you're just asking us to document it better (which is a good idea).
>>>> It does not depend on the name of the standby server. That is, the standby
>>>> server that connected to the master server first, during the initial replication
>>>> configuration, has top priority even if the priorities of the two servers are the same.
>>>
>>> What is happening here is that, in the case of '*', as the priority of both is
>>> the same, the system will choose whichever
>>> comes first in the list of registered standbys (the list is maintained in
>>> the structure WalSndCtl).
>>> Each standby is registered with WalSndCtl when a new WALSender is
>>> started in function InitWalSenderSlot().
>>> As 'AAA' has been registered first it becomes preferred sync standby
>>> even if priorities of both are same.
>>> When 'AAA' goes down, it marks that Slot entry as free (by setting
>>> pid=0 in function WalSndKill),
>>> now when 'AAA' comes back again, it gets that free Slot entry and
>>> again becomes preferred sync standby.
>>>
>>> Now, if we want to fix this as you are suggesting, which I don't think is
>>> necessary, we might need to change WalSndKill and some other places so
>>> that whenever any standby goes down, it changes the slots of the already
>>> registered standbys.
>>>> The user must remember which standby server connected to the master server
>>>> first.
>>>> I think that this behaviour confuses users,
>>>> so I think that we need to either modify this behaviour or, if '*' is used, make the
>>>> priorities of the servers differ (updating the manual would also be good).
>>>
>>> Here user has done the settings (setting synchronous_standby_names =
>>> '*'), after which he will not have any control which standby will
>>> become sync standby, so ideally he should not complain.
>>>
>>> It might be case that for some users current behavior is good enough
>>> which means that with '*' whichever standby has become sync standby
>>> first, it will be the sync standby always if alive.
>
>> I'm thinking that it is not necessary to change WalSndKill.
>> For example, we add a value (e.g., sync_standby) that records
>> which wal sender is the active SYNC rep.
>> If sync_standby is already set and that standby is active, the server doesn't
>> look for an active standby.
>> Only if sync_standby is not set, or it is inactive, does the server look for
>> which standby should be the active SYNC rep.
>> That way, we also avoid searching for the active SYNC rep every time
>> SyncRepReleaseWaiters() is called.
> For '*' case, it will be okay, but when the user has given proper
> names, in that case even if there is any active Sync
> Rep, it has to be changed based on priority.
>
> I think here how to provide a fix, so that behavior gets changed to
> what you describe is a second priority work, first
> is to show the value of use-case. Do you really know where people
> actually setup using '*' as configuration and if
> yes, are they annoyed with current behavior?
> I have thought about it, but couldn't imagine a scenario where people
> will be using '*' in their production
> configurations; maybe it will be useful in test labs.

Thank you for your feedback.
I have implemented a patch that changes how priority is assigned among the
walsenders, based on what I suggested.
I added a sync_standby value to WalSndCtl. This value records
which walsender is the active sync rep.
The patch also handles the case where the user has given explicit standby names.
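
The behaviour discussed in this thread can be modelled with a minimal
standalone sketch. It is not the attached patch and does not use the real
PostgreSQL structures: Slot, Ctl, ctl_register(), ctl_kill(), pick_by_scan()
and pick_sticky() are hypothetical names invented for illustration, and the
tie-breaking rule in pick_sticky() is only an assumption about what
"remember the active sync standby" could mean. pick_by_scan() models the
behaviour described above (the first registered slot wins a priority tie);
pick_sticky() models the proposal (keep the remembered standby unless a
strictly higher-priority one is connected).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 4             /* stand-in for max_wal_senders */

/* Simplified stand-in for a walsender slot; NOT the real WalSnd struct. */
typedef struct
{
    int pid;                    /* 0 means the slot is free */
    int priority;               /* 0 = async; lower positive value wins */
} Slot;

/* Simplified stand-in for WalSndCtl; sync_standby is a hypothetical field. */
typedef struct
{
    Slot slots[MAX_SLOTS];
    int  sync_standby;          /* remembered sync standby slot, -1 if none */
} Ctl;

void
ctl_init(Ctl *c)
{
    for (int i = 0; i < MAX_SLOTS; i++)
        c->slots[i].pid = c->slots[i].priority = 0;
    c->sync_standby = -1;
}

/* A connecting standby takes the first free slot; returns -1 if all are busy. */
int
ctl_register(Ctl *c, int pid, int priority)
{
    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (c->slots[i].pid == 0)
        {
            c->slots[i].pid = pid;
            c->slots[i].priority = priority;
            return i;
        }
    }
    return -1;
}

/* A disconnecting standby frees its slot by resetting pid to 0. */
void
ctl_kill(Ctl *c, int slot)
{
    assert(slot >= 0 && slot < MAX_SLOTS);
    c->slots[slot].pid = 0;
    c->slots[slot].priority = 0;
}

/* A slot can be sync standby only if it is in use and not async. */
static bool
slot_is_candidate(const Ctl *c, int slot)
{
    return c->slots[slot].pid != 0 && c->slots[slot].priority > 0;
}

/* Scan every time: on a priority tie the lowest slot index wins. */
int
pick_by_scan(const Ctl *c)
{
    int best = -1;

    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (!slot_is_candidate(c, i))
            continue;
        if (best < 0 || c->slots[i].priority < c->slots[best].priority)
            best = i;
    }
    return best;
}

/* Sticky choice: keep the remembered standby unless it is gone or a
 * strictly better priority is connected (assumed rule, see lead-in). */
int
pick_sticky(Ctl *c)
{
    int best = pick_by_scan(c);
    int cur = c->sync_standby;

    if (cur >= 0 && slot_is_candidate(c, cur) &&
        c->slots[best].priority >= c->slots[cur].priority)
        return cur;

    c->sync_standby = best;
    return best;
}
```

In this model, with AAA and BBB both at priority 1, AAA reconnecting into
its old slot makes pick_by_scan() choose AAA again, while pick_sticky()
stays on BBB as long as it was consulted while AAA was down. With distinct
priorities, both functions return the higher-priority standby once it is
back.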

Regards,

-------
Sawada Masahiko

Attachment: Sync_standby_priority_v1.patch (application/octet-stream, 3.9 KB)
