Re: Standalone synchronous master

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, MauMau <maumau307(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Standalone synchronous master
Date: 2014-01-10 22:45:22
Message-ID: 20140110224522.GB13568@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2014-01-10 17:28:55 -0500, Stephen Frost wrote:
> > Why do you know that you didn't loose any transactions? Trivial network
> > hiccups, a restart of a standby, IO overload on the standby all can
> > cause a very short interruptions in the walsender connection - leading
> > to degradation.

> You know that you haven't *lost* any by virtue of the master still being
> up. The case you describe is a double-failure scenario- the link between
> the master and slave has to go away AND the master must accept a
> transaction and then fail independently.

Unfortunately network outages do correlate with other system
faults. What you're wishing for really is the "I like the world to be
friendly to me" mode.
Even if you have only disk problems, quite often if your disks die, you
can continue to write (especially with a BBU), but uncached reads
fail. So the walsender connection errors out because a read failed, and
youre degrading into async mode. *Because* your primary is about to die.

> > > As pointed out by someone
> > > previously, that's how RAID-1 works (which I imagine quite a few of us
> > > use).
> >
> > I don't think that argument makes much sense. Raid-1 isn't safe
> > as-is. It's only safe if you use some sort of journaling or similar
> > ontop. If you issued a write during a crash you normally will just get
> > either the version from before or the version after the last write back,
> > depending on the state on the individual disks and which disk is treated
> > as authoritative by the raid software.

> Uh, you need a decent raid controller then and we're talking about after a
> transaction commit/sync.

Yes, if you have a BBU that memory is authoritative in most cases. But
in that case the argument of having two disks is pretty much pointless,
the SPOF suddenly became the battery + ram.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2014-01-10 22:47:13 Re: Standalone synchronous master
Previous Message Joshua D. Drake 2014-01-10 22:44:28 Re: Standalone synchronous master