Re: standby registration (was: is sync rep stalled?)

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 16:40:55
Message-ID: 1286296855.2025.1737.camel@ebony
Lists: pgsql-hackers

On Tue, 2010-10-05 at 11:46 -0400, Robert Haas wrote:
> On Tue, Oct 5, 2010 at 10:46 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:
> >> >>
> >> >> When you have one server functioning at each site you'll block until
> >> >> you get a third machine back, rather than replicating to both sites
> >> >> and remaining functional.
> >> >
> >> > And that is so important a consideration that you would like to move
> >> > from one parameter in one file to a whole set of parameters, set
> >> > differently in 5 separate files?
> >>
> >> I don't accept that this is the trade-off being proposed. You seem
> >> convinced that having the config all in one place on the master is
> >> going to make things much more complicated, but I can't see why.
> >
> > But it is not "all in one place" because the file needs to be different
> > on 5 separate nodes. Which *does* make it more complicated than the
> > alternative, which is a single parameter, set the same everywhere.
>
> Well, you only need to have the file at all on nodes you want to fail
> over to. And aren't you going to end up rejiggering the config when
> you fail over anyway, based on what happened? I mean, suppose you
> have three servers and you require sync rep to 2 slaves. If the
> master falls over and dies, it seems likely you're going to want to
> relax that restriction. Or suppose you have three servers and you
> require sync rep to 1 slave. The first time you fail over, you're
> going to probably want to leave that config as-is, but if you fail
> over again, you're very likely going to want to change it.

Single failovers are common. Multiple failovers aren't. For me, the key
question is what the common case is, not the edge cases.

> This is really the key question for me. If distributing the
> configuration throughout the cluster meant that we could just fail
> over and keep on trucking, that would be, well, really neat, and a
> very compelling argument for the design you are proposing.

Good, thanks.

The important thing is that in the minutes and hours immediately after
failover it will all still work; there is no need to change to a
different and very likely untested config.

If you configure N+1 or N+2 redundancy, we should assume that if you
lose a node you will be striving to replace it quickly rather than
shrugging and saying "you lose some". Note as well that when you do add
that other node back in, you won't need to change the config back again
afterwards. It all just works and keeps working, so the DBA can spend
their time investigating the issue and seeing if they can get the
original master back up, not keeping one eye on the config files of the
remaining servers.
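
To make the scenario concrete, here is a minimal sketch of the "single
parameter, set the same everywhere" idea, using the three-server,
one-synchronous-standby case from the quoted text. It is only a toy
model: the names SYNC_STANDBYS_REQUIRED, Cluster, can_commit, fail_over
and add_node are hypothetical and are not PostgreSQL settings or APIs.

```python
# Toy model, not PostgreSQL code. Every node is assumed to carry the same
# single setting: how many standbys must acknowledge a commit.
from dataclasses import dataclass

SYNC_STANDBYS_REQUIRED = 1  # hypothetical setting, identical on every node


@dataclass
class Cluster:
    nodes: set    # names of the nodes that are currently up
    master: str   # name of the current master

    def standbys(self):
        # Every live node other than the master is a standby.
        return self.nodes - {self.master}

    def can_commit(self, required=SYNC_STANDBYS_REQUIRED):
        # A commit can proceed only if enough standbys are available
        # to acknowledge it.
        return len(self.standbys()) >= required

    def fail_over(self, new_master):
        # The old master is lost and a surviving standby is promoted.
        # No configuration is edited on any node.
        self.nodes.discard(self.master)
        if new_master not in self.nodes:
            raise ValueError("new master must be a surviving node")
        self.master = new_master

    def add_node(self, name):
        # A replacement node joins with the same shared setting.
        self.nodes.add(name)


cluster = Cluster(nodes={"a", "b", "c"}, master="a")
before = cluster.can_commit()          # two standbys available

cluster.fail_over("b")
after_first = cluster.can_commit()     # one standby ("c") still available

cluster.fail_over("c")
after_second = cluster.can_commit()    # no standbys left, commits would wait

cluster.add_node("d")
after_replace = cluster.can_commit()   # replacement restores the standby
```

In this model the first failover needs no config change, a second
failover without a replacement leaves commits waiting, and adding a
replacement node restores service, again without touching the config.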

> But since
> that seems impossible to me, I'm arguing for centralizing the
> configuration file for ease of management.

You can't "centralize" something in 5 different places, at least not in
my understanding of the word.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
