Re: Issues with Quorum Commit

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-08 19:31:58
Message-ID: 4CAF71AE.5040006@enterprisedb.com
Lists: pgsql-hackers

On 08.10.2010 17:26, Fujii Masao wrote:
> On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Do we really need that?
>
> Yes. But if there is no unsent WAL when the master goes down,
> we can start a new standby without a new backup by copying the
> timeline history file from the new master to the new standby and
> setting recovery_target_timeline to 'latest'.

.. and restart the standby.

> In this case,
> the new standby advances recovery to the latest timeline ID,
> which the new master uses, before connecting to the master.
>
> This seems to have been successful in my test environment,
> though I may be missing something.

Yeah, that should work, but it's awfully complicated.
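
To make the manual procedure quoted above concrete, here is a rough sketch of the two file-level steps (copy the timeline history file, then request the latest timeline). It is an illustration only, not something from this thread: the function names, the local-path layout, and the choice of the standby's pg_xlog directory as the copy destination are assumptions, and it assumes recovery.conf does not already set recovery_target_timeline. The standby still has to be restarted afterwards.

```python
import shutil
from pathlib import Path


def history_file_name(timeline_id: int) -> str:
    """Name of a timeline history file: 8 hex digits of the timeline ID plus '.history'."""
    return f"{timeline_id:08X}.history"


def prepare_standby(new_master_xlog: Path, standby_data: Path, timeline_id: int) -> Path:
    """Copy the new master's timeline history file and request the latest timeline.

    Hypothetical helper: both directories are assumed to be reachable as local
    paths; in practice the copy would go over scp, rsync or similar.
    """
    name = history_file_name(timeline_id)
    standby_xlog = standby_data / "pg_xlog"  # assumed destination on the standby
    standby_xlog.mkdir(parents=True, exist_ok=True)
    shutil.copy2(new_master_xlog / name, standby_xlog / name)

    # Append the setting to recovery.conf; the restart is a separate manual step.
    conf = standby_data / "recovery.conf"
    with conf.open("a") as f:
        f.write("recovery_target_timeline = 'latest'\n")
    return standby_xlog / name
```

Even in this reduced form the procedure needs out-of-band file copying plus a restart, which is the complexity being objected to.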

>> I don't think that's acceptable, we'll need to fix
>> that if that's the case.
>
> Agreed.
>
>> You can cross timelines with the archive, though. But IIRC there was some
>> issue with that too, you needed to restart the standbys because the standby
>> scans what timelines exist at the beginning of recovery, and won't notice
>> new timelines that appear after that?
>
> Yes.
>
>> We need to address that, apart from any of the other things discussed wrt.
>> synchronous replication. It will benefit asynchronous replication too. IMHO
>> *that* is the next thing we should do, the next patch we commit.
>
> You mean to commit that capability before synchronous replication? If so,
> I disagree with you. I think that it's not easy to address that problem.
> So I'm worried that implementing that capability first would mean sync rep
> misses 9.1.

It's a pretty severe shortcoming at the moment. For starters, it means
that you need a shared archive, even if you set wal_keep_segments to a
high number. Secondly, it takes a lot of scripting to get it working, and I
don't like the thought of testing failovers in synchronous replication
if I have to do all that. Frankly, this seems more important to me than
synchronous replication.

It shouldn't be too hard to fix. Walsender needs to be able to read WAL
from preceding timelines, like recovery does, and walreceiver needs to
write the incoming WAL to the right file.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
