Re: Issues with Quorum Commit

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-07 17:22:15
Message-ID: 4CAE01C7.7070505@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/7/10 6:41 AM, Aidan Van Dyk wrote:
> I'm really confused with all this k < N scenarious I see bandied
> about, because, all it really amounts to is "I only want *one*
> syncronous replication, and a bunch of synchrounous replications".
> And a bit of chance thrown in the mix to hope the "syncronous" one is
> pretty stable, the asynchronous ones aren't *too* far behind (define
> too and far at your leisure).

Effectively, yes. The the difference between k of N synch rep and 1
synch standby + several async standbys is that in k of N, you have a
pool and aren't dependent on having a specific standby be very reliable,
just that any one of them is.

So if you have k = 3 and N = 10, then you can have 10 standbys and only
3 of them need to ack any specific commit for the master to proceed. As
long as (a) you retain at least one of the 3 which ack'd, and (b) you
have some way of determining which standby is the most "caught up", data
loss is fairly unlikely; you'd need to lose 4 of the 10, and the wrong
4, to lose data.

The advantage of this for availability over just having k = N = 3 comes
when one of the standbys is responding slowly (due to traffic) or goes
offline unexpectedly due to a hardware failure. In the k = N = 3 case,
the system halts. In the k = 3, N = 10 case, you can lose up to 7
standbys without the system going down.

It's notable that the massively scalable transactional databases
(Dynamo, Cassandra, various telecom databases, etc.) all operate this way.

However, I do consider this "advanced" functionality and not worth
pursuing until we have the k = 1 case implemented and well-tested. For
comparison, Cassandra, Hypertable and Riak have been working on their k
< N functionality for a couple years now and none of them has it stable
*and* fast.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2010-10-07 17:27:38 Re: standby registration (was: is sync rep stalled?)
Previous Message Kevin Grittner 2010-10-07 17:21:21 Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance