Re: Behaviour of take over the synchronous replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Behaviour of take over the synchronous replication
Date: 2013-08-28 13:59:28
Message-ID: CAA4eK1+9eHGoZFU_iWaLapG=qPN_4te6KO6Ai+ufXbyRLcjVvA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 27, 2013 at 4:51 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Sun, Aug 25, 2013 at 3:21 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Sat, Aug 24, 2013 at 2:46 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> On Sat, Aug 24, 2013 at 3:14 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>>> On 08/23/2013 12:42 AM, Sawada Masahiko wrote:
>>>>> In case (a), the priorities are clear, so I think that re-taking over
>>>>> is the correct behaviour.
>>>>> OTOH, in case (b), even if AAA and BBB are set to the same priority, the
>>>>> AAA server steals SYNC replication.
>>>>> I think it is better that the BBB server continues as the SYNC standby,
>>>>> and AAA becomes the potential server.
>>>>
>>>> So, you're saying that:
>>>>
>>>> 1) synchronous_standby_names = '*'
>>>>
>>>> 2) replica 'BBB' is the current sync standby
>>>>
>>>> 3) replica 'AAA' comes online
>>>>
>>>> 4) replica 'AAA' grabs sync status
>>>>
>>>> ?
>>> I'm sorry for the confusion.
>>> I mean the following:
>>>
>>> 1) synchronous_standby_names = '*'
>>>
>>> 2) replica 'AAA' is the current sync standby
>>>
>>> 3) replica 'BBB' is the current async standby (potential sync standby)
>>>
>>> 4) replica 'AAA' fails; after that, replica 'BBB' is the current sync standby.
>>>
>>> 5) replica 'AAA' comes online
>>>
>>> 6) replica 'AAA' grabs sync status
>>>
>>>>
>>>
>>>
>>>> If that's the case, I'm not really sure that's undesirable behavior.
>>>> One could argue fairly persuasively that if you care about the
>>>> precedence order of sync replicas, you shouldn't use '*'. And the rule
>>>> of "if using *, the lowest-sorted replica name has sync" is actually a
>>>> predictable, easy-to-understand rule.
>>>>
>>>> So if you want to make this a feature request, you'll need to come up
>>>> with an argument as to why the current behavior is bad. Otherwise,
>>>> you're just asking us to document it better (which is a good idea).
>>> It does not depend on the name of the standby server. That is, the standby
>>> server that connected to the master first, during the initial replication
>>> configuration, has top priority even if the two servers' priorities are equal.
>>
>> What is happening here is that, in the case of '*', the priorities of both
>> are the same, so the system will choose whichever
>> comes first in the list of registered standbys (the list is maintained in
>> the structure WalSndCtl).
>> Each standby is registered with WalSndCtl when a new WALSender is
>> started in function InitWalSenderSlot().
>> As 'AAA' has been registered first it becomes preferred sync standby
>> even if priorities of both are same.
>> When 'AAA' goes down, it marks that Slot entry as free (by setting
>> pid=0 in function WalSndKill),
>> now when 'AAA' comes back again, it gets that free Slot entry and
>> again becomes preferred sync standby.
>>
>> Now if we want to fix this as you are suggesting, which I don't think is
>> necessary, we might need to change WalSndKill and some other places so
>> that whenever any standby goes down, the slots of the already
>> registered standbys are changed.
>>> The user must remember which standby server connected to the master server
>>> first.
>>> I think that this behavior confuses users,
>>> so I think that we need to either modify this behaviour or, if '*' is used,
>>> make the servers' priorities differ (updating the manual would also be good).
>>
>> Here user has done the settings (setting synchronous_standby_names =
>> '*'), after which he will not have any control which standby will
>> become sync standby, so ideally he should not complain.
>>
>> It might be case that for some users current behavior is good enough
>> which means that with '*' whichever standby has become sync standby
>> first, it will be the sync standby always if alive.
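To make the slot reuse described above concrete, here is a minimal sketch of
the first-free-slot behaviour. It is a simplified model, not the actual
walsender code: the Slot struct, MAX_SLOTS, slot_register(), slot_kill() and
choose_sync() are hypothetical stand-ins for WalSndCtl, max_wal_senders,
InitWalSenderSlot(), WalSndKill() and the sync standby search, and the pids
are made up.

```c
#include <assert.h>
#include <string.h>

#define MAX_SLOTS 4   /* hypothetical stand-in for max_wal_senders */

/* Hypothetical, simplified stand-in for one WAL sender slot. */
typedef struct {
    int  pid;         /* 0 means the slot is free */
    int  priority;    /* 0 = not a sync candidate; lower positive value wins */
    char name[16];    /* standby's application name */
} Slot;

/* A newly connected standby takes the first free slot. Returns its index, or -1. */
static int slot_register(Slot *slots, int pid, int priority, const char *name)
{
    assert(pid != 0);
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].pid == 0) {
            slots[i].pid = pid;
            slots[i].priority = priority;
            strncpy(slots[i].name, name, sizeof(slots[i].name) - 1);
            slots[i].name[sizeof(slots[i].name) - 1] = '\0';
            return i;
        }
    }
    return -1;   /* no free slot */
}

/* A standby that goes away only marks its slot free; other slots do not move. */
static void slot_kill(Slot *slots, int idx)
{
    assert(idx >= 0 && idx < MAX_SLOTS);
    slots[idx].pid = 0;
}

/* Pick the active candidate with the best priority; on a tie the lowest slot wins. */
static int choose_sync(const Slot *slots)
{
    int best = -1;
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].pid == 0 || slots[i].priority == 0)
            continue;
        if (best == -1 || slots[i].priority < slots[best].priority)
            best = i;
    }
    return best;
}
```

In this model, registering 'AAA' and then 'BBB' with equal priority, killing
'AAA', and registering 'AAA' again puts 'AAA' back in slot 0, so choose_sync()
returns it again even though 'BBB' was the sync standby in between.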

> I'm thinking that it is not necessary to change WalSndKill.
> For example, we could add a value (e.g., sync_standby) that records
> which WAL sender is the active SYNC rep.
> If sync_standby is already set and that sender is active, the server does
> not look for another active standby.
> Only if sync_standby is not set or it is inactive does the server look for
> which standby is the active SYNC rep.
> If so, we also avoid searching for the active SYNC rep every time
> SyncRepReleaseWaiters() is called.
For the '*' case that will be okay, but when the user has given explicit
names, then even if there is an active sync standby, it has to be
changed based on priority.
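
As a rough illustration of the constraint above, here is one possible way a
remembered sync standby could coexist with named priorities. It is only a
sketch under assumptions, not a proposed patch: the Standby struct, MAX_SLOTS
and choose_sync_sticky() are hypothetical names, and the rule "keep the
incumbent unless a strictly better priority is active" is my reading of the
discussion.

```c
#include <assert.h>

#define MAX_SLOTS 4   /* hypothetical stand-in for max_wal_senders */

/* Hypothetical, simplified view of a connected standby. */
typedef struct {
    int pid;        /* 0 means the slot is free */
    int priority;   /* 0 = not a sync candidate; lower positive value wins */
} Standby;

static int is_candidate(const Standby *s)
{
    return s->pid != 0 && s->priority > 0;
}

/*
 * Return the slot that should be the sync standby, given the slot that
 * currently holds that role ("current", or -1 if none).  The incumbent is
 * kept while it is alive unless another active standby has a strictly
 * better (lower) priority value.
 */
static int choose_sync_sticky(const Standby *slots, int current)
{
    int best;

    assert(current >= -1 && current < MAX_SLOTS);
    best = (current >= 0 && is_candidate(&slots[current])) ? current : -1;

    for (int i = 0; i < MAX_SLOTS; i++) {
        if (!is_candidate(&slots[i]))
            continue;
        /* Strict comparison: an equal priority never displaces the incumbent. */
        if (best == -1 || slots[i].priority < slots[best].priority)
            best = i;
    }
    return best;
}
```

With '*' every standby has the same priority, so the incumbent stays until it
disconnects; with explicit names a returning higher-priority standby still
takes over. Note that this sketch still scans all slots on every call, so by
itself it does not give the saving of skipping the search in
SyncRepReleaseWaiters().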

I think that how to provide a fix, so that the behavior changes to
what you describe, is second-priority work; the first step
is to show the value of the use case. Do you actually know of places where
people set up using '*' as the configuration and, if
so, are they annoyed by the current behavior?

I have thought about it, but could not imagine a scenario where people
will be using '*' in their production
configurations; maybe it will be useful in test labs.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
