Re: Standalone synchronous master

From: Robert Treat <rob(at)xzilla(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Standalone synchronous master
Date: 2014-01-09 04:09:01
Message-ID: CABV9wwMcOQbb6YY-dF5qJ=B8di3xrco-bkrMvwj7A8o+jvqv3Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 8, 2014 at 6:15 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Stephen,
>
>
>> I'm aware, my point was simply that we should state, up-front in
>> 25.2.7.3 *and* where we document synchronous_standby_names, that it
>> requires at least three servers to be involved to be a workable
>> solution.
>
> It's a workable solution with 2 servers. That's a "low-availability,
> high-integrity" solution; the user has chosen to double their risk of
> not accepting writes against never losing a write. That's a perfectly
> valid configuration, and I believe that NTT runs several applications
> this way.
>
> In fact, that can already be looked at as a kind of "auto-degrade" mode:
> if there aren't two nodes, then the database goes read-only.
>
> Might I also point out that transactions are synchronous or not
> individually? The sensible configuration is for only the important
> writes being synchronous -- in which case auto-degrade makes even less
> sense.
>
> I really think that demand for auto-degrade is coming from users who
> don't know what sync rep is for in the first place. The fact that other
> vendors are offering auto-degrade as a feature instead of the ginormous
> foot-gun it is adds to the confusion, but we can't help that.
>
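
Josh's point about transactions being synchronous or not individually refers to the synchronous_commit setting, which can be changed per session or per transaction. A minimal sketch (the table and column names here are hypothetical):

```sql
BEGIN;
SET LOCAL synchronous_commit = off;  -- this transaction won't wait for the standby
INSERT INTO activity_log (msg) VALUES ('low-value write');
COMMIT;

BEGIN;
SET LOCAL synchronous_commit = on;   -- this one waits for the sync standby's flush
INSERT INTO payments (amount) VALUES (100.00);
COMMIT;
```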

I think the problem here is that we tend to have a limited view of
"the right way to use sync rep". If I have five nodes, and I set one
standby synchronous and the other three asynchronous, I've set up a
"known successor" for the event that the leader fails. In this
scenario, though, if the successor fails, you probably do want to keep
accepting writes, since you weren't using synchronous replication for
durability but for operational simplicity. I suspect there are other
scenarios where users are willing to trade latency for improved and/or
directed durability, but not at the expense of availability, don't you
think?
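
Concretely, the "known successor" setup above could be expressed on the primary roughly like this (node names are hypothetical; the standby named first in synchronous_standby_names that is connected becomes the synchronous one):

```
# postgresql.conf on the primary -- node names are hypothetical
synchronous_standby_names = 'successor'  # only the designated successor is synchronous
synchronous_commit = on
# The other three standbys connect with a different application_name
# and therefore replicate asynchronously.
```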

In fact, there are entire systems that provide exactly that kind of
tunability. It's worth mentioning that there's a nice primer on
tunable consistency in the Riak docs, which I strongly recommend:
http://docs.basho.com/riak/1.1.0/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/
I'm not entirely sure how well it maps onto our problem space, but it
at least gives you a sane working model to think about. In Postgres
terms, async replication is like the N value (I want the data to end
up on this many nodes eventually) and sync is like the W value (it
must be written to this many nodes, or the write should fail). Of
course, we only offer R = 1, W = 1 or 2, and N = all. And it's worse
than that, because we have golden nodes.
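
The "golden nodes" distinction can be made concrete with a toy model (pure illustration, neither Postgres nor Riak code; the node names are made up). In a Riak-style quorum, any W acknowledgements out of N replicas satisfy a write; in Postgres-style sync rep, only an ack from a designated standby counts, regardless of how many other replicas have the data:

```python
# Toy model of tunable write durability. N replicas exist; a write
# needs acknowledgements before it is considered durable.

def riak_style_ok(acked, w):
    """Riak-style: any W acks out of the N replicas suffice."""
    return len(acked) >= w

def pg_style_ok(acked, sync_names):
    """Postgres-style sketch: an ack must come from a designated
    ("golden") synchronous standby; acks from other nodes don't count.
    (The primary's own WAL flush is ignored here for simplicity.)"""
    return any(name in sync_names for name in acked)

acked = {"standby2", "standby4"}          # which nodes acknowledged this write
print(riak_style_ok(acked, w=2))          # True: two acks meet W=2
print(pg_style_ok(acked, {"standby1"}))   # False: the golden node is down
```

The point of the toy: with W=2 a Riak-style system keeps accepting writes as long as *any* two replicas respond, while the Postgres-style rule stalls the moment its one designated standby goes away, which is exactly the availability trade-off being debated.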

This isn't to say there isn't a lot of confusion around the issue.
Designing, implementing, and configuring different guarantees in the
presence of node failures is a non-trivial problem. Still, I'd prefer
to see Postgres head in the direction of providing more options in
this area rather than drawing a firm line at being a CP-oriented
system.

Robert Treat
play: xzilla.net
work: omniti.com
