Re: New and interesting replication issues with 9.2.8 sync rep

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: New and interesting replication issues with 9.2.8 sync rep
Date: 2014-05-05 17:30:17
Message-ID: 5367CAA9.5080705@agliodbs.com
Lists: pgsql-hackers

On 05/05/2014 10:25 AM, Andres Freund wrote:
> On 2014-05-05 10:16:27 -0700, Josh Berkus wrote:
>> On 05/03/2014 01:07 AM, Andres Freund wrote:
>>> On 2014-05-02 18:57:08 -0700, Josh Berkus wrote:
>>>> Just got a report of a replication issue with 9.2.8 from a community member:
>>>>
>>>> Here's the sequence:
>>>>
>>>> 1) A --> B (sync rep)
>>>>
>>>> 2) Shut down B
>>>>
>>>> 3) Shut down A
>>>>
>>>> 4) Start up B as a master
>>>>
>>>> 5) Start up A as sync replica of B
>>>>
>>>> 6) A successfully joins B as a sync replica, even though its transaction
>>>> log is 1016 bytes *ahead* of B.
>>>>
>>>> 7) Transactions written to B all hang
>>>>
>>>> 8) Xlog on A is now corrupt, although the database itself is OK
>>>
>>> This is fundamentally borked practice.
>>>
>>>> Now, the above sequence happened because of the user misunderstanding
>>>> what sync rep really means. However, A should not have been able to
>>>> connect with B in replication mode, especially in sync rep mode; that
>>>> should have failed. Any thoughts on why it didn't?
>>>
>>> I'd guess that B, while starting up, has written further WAL records
>>> bringing it further ahead of A.
>>
>> Apparently not; from what I've seen pg_stat_replication even *shows*
>> that the replica is ahead of the master. Further, Postgres should have
>> recognized that there was a timeline branch point before A's last
>> record, no?
>
> There wasn't any timeline increase because - as far as I understand the
> above - there wasn't any promotion. The cluster was shut down and
> recovery.conf was created/removed respectively.

Ah, oops, left out a step. B was promoted.
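
To make the "A is ahead of B" condition concrete, here is a rough sketch
of the comparison one would expect a rejoin check to make. It is only an
illustration, not what the server does: the helper names (parse_lsn,
lsn_diff, has_diverged) and the sample positions are made up, and it
treats an xlog location as a plain 64-bit value, which may be slightly off
across log-file boundaries on 9.2. On a live 9.2 server,
pg_xlog_location_diff() is the safer way to get the byte difference.

```python
def parse_lsn(text):
    """Convert an xlog location like '0/30003F8' (hex hi/lo) to an integer.

    Assumption: the location is treated as a flat 64-bit byte position.
    """
    hi, lo = text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff(a, b):
    """Bytes by which location a is ahead of location b (negative if behind)."""
    return parse_lsn(a) - parse_lsn(b)


def has_diverged(standby_lsn, switch_point_lsn):
    """True if the would-be standby holds WAL past the point where the
    promoted node branched onto its new timeline."""
    return lsn_diff(standby_lsn, switch_point_lsn) > 0


# Hypothetical positions chosen to reproduce the reported 1016-byte gap.
b_switch_point = "0/3000000"   # where B was promoted (made-up value)
a_last_record = "0/30003F8"    # end of A's xlog (made-up value)

print(lsn_diff(a_last_record, b_switch_point))
print(has_diverged(a_last_record, b_switch_point))
```

If has_diverged() is true, A carries records that B never had, so letting
A attach as a replica (sync or otherwise) cannot be correct.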

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
