Re: standby registration (was: is sync rep stalled?)

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 15:20:30
Message-ID: AANLkTi=f7xvzKrDrPVb6PAGXLPweQY==rHVNk_1sZnAv@mail.gmail.com

On Mon, Oct 4, 2010 at 3:08 AM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> On 10/01/2010 05:06 PM, Dimitri Fontaine wrote:
>> Wait forever can be done without standby registration, with quorum commit.
>
> Yeah, I also think the only reason for standby registration is ease of
> configuration (if at all). There's no technical requirement for standby
> registration, AFAICS. Or does anybody know of a realistic use case
> that's possible with standby registration, but not with quorum commit?

Quorum commit, even with configurable vote weights, can't handle a
requirement that a particular commit be replicated to (A || B) && (C
|| D).

The use case is something like "we want to make sure we've replicated
to at least one of the two servers in the Berlin datacenter and at
least one of the two servers in the Hong Kong datacenter".
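
The claim can be checked with a minimal Python sketch. It assumes a hypothetical weighted-quorum model in which a commit is released once the summed weights of the acknowledging standbys reach a threshold; the names `rule` and `find_weighted_quorum` are illustrative and not part of any proposal. The rule needs A+C and B+D to reach the threshold while A+B and C+D stay below it, which is contradictory because both pairs of sums add up to the same total. The brute-force search only confirms this for small integer weights.

```python
from itertools import combinations, product

STANDBYS = ("A", "B", "C", "D")


def rule(acked):
    """Target commit rule: (A || B) && (C || D)."""
    return bool(acked & {"A", "B"}) and bool(acked & {"C", "D"})


def all_subsets(names):
    """Yield every subset of names as a frozenset."""
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            yield frozenset(combo)


def find_weighted_quorum(max_weight=6):
    """Search small integer weights and thresholds for an exact match.

    Returns (weights, threshold) if some weighted quorum agrees with
    rule() on every subset of standbys, otherwise None.
    """
    subsets = list(all_subsets(STANDBYS))
    for weights in product(range(max_weight + 1), repeat=len(STANDBYS)):
        w = dict(zip(STANDBYS, weights))
        for threshold in range(1, sum(weights) + 2):
            if all((sum(w[s] for s in acked) >= threshold) == rule(acked)
                   for acked in subsets):
                return w, threshold
    return None


print(find_weighted_quorum())  # prints None: no assignment matches
```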

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 17:57:50
Message-ID: 4CAA159E.1060000@bluegap.ch

On 10/04/2010 05:20 PM, Robert Haas wrote:
> Quorum commit, even with configurable vote weights, can't handle a
> requirement that a particular commit be replicated to (A || B) && (C
> || D).

Good point.

Can the proposed standby registration configuration format cover such a
requirement?

Regards

Markus Wanner


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 19:02:38
Message-ID: AANLkTinzEP8KO6h5Zi18EzSwC9S-aa92YJccpW79g+so@mail.gmail.com

On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> On 10/04/2010 05:20 PM, Robert Haas wrote:
>> Quorum commit, even with configurable vote weights, can't handle a
>> requirement that a particular commit be replicated to (A || B) && (C
>> || D).
>
> Good point.
>
> Can the proposed standby registration configuration format cover such a
> requirement?

Well, if you can name the standbys, there's no reason there couldn't
be a parameter that takes a string that looks pretty much like the
above. There are, of course, some situations that could be handled
more elegantly by quorum commit ("any 3 of 5 available standbys") but
the above is more general and not unreasonably longwinded for
reasonable numbers of standbys.
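
As a rough illustration of such a parameter, the following Python sketch evaluates a rule string against the set of standbys that have acknowledged a commit. The grammar, the function names `tokenize` and `evaluate`, and the choice of `&&` binding tighter than `||` are all assumptions made for the example; no configuration syntax was actually specified in this thread.

```python
import re

TOKEN = re.compile(r"\s*(\|\||&&|\(|\)|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(text):
    """Split a rule string into names, '||', '&&' and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(rule, acked):
    """Return True if the standbys in `acked` satisfy `rule`.

    Assumed grammar ('&&' binds tighter than '||', as in C):
        expr   := term ('||' term)*
        term   := factor ('&&' factor)*
        factor := NAME | '(' expr ')'
    """
    tokens = tokenize(rule)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise ValueError("missing ')'")
            advance()
            return value
        if tok is None or tok in ("||", "&&", ")"):
            raise ValueError("expected a standby name")
        advance()
        return tok in acked

    def term():
        value = factor()
        while peek() == "&&":
            advance()
            rhs = factor()  # always parse the operand, then combine
            value = value and rhs
        return value

    def expr():
        value = term()
        while peek() == "||":
            advance()
            rhs = term()
            value = value or rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result
```

With this sketch, `evaluate("(A || B) && (C || D)", {"A", "C"})` is true, while the same rule with only `{"C", "D"}` acknowledged is false.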

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: David Christensen <david(at)endpoint(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 19:25:19
Message-ID: 435F2588-894C-4C4B-A7ED-32776EE8A22C@endpoint.com


On Oct 4, 2010, at 2:02 PM, Robert Haas wrote:

> On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
>> On 10/04/2010 05:20 PM, Robert Haas wrote:
>>> Quorum commit, even with configurable vote weights, can't handle a
>>> requirement that a particular commit be replicated to (A || B) && (C
>>> || D).
>>
>> Good point.
>>
>> Can the proposed standby registration configuration format cover such a
>> requirement?
>
> Well, if you can name the standbys, there's no reason there couldn't
> be a parameter that takes a string that looks pretty much like the
> above. There are, of course, some situations that could be handled
> more elegantly by quorum commit ("any 3 of 5 available standbys") but
> the above is more general and not unreasonably longwinded for
> reasonable numbers of standbys.

Is there any benefit to be had from having standby roles instead of individual names? For instance, you could integrate this into quorum commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and 1 "tokyo" standby from a group of multiple per data center, or even just utilize role sizes of 1 if you wanted individual standbys to be "named" in this fashion. This role could be provided when the standby connects, and is more or less tangential to the specific registration issue.
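
A minimal Python sketch of the role idea follows. It assumes each standby announces a single role name when it connects and that the master keeps a per-role count of required acknowledgements; `REQUIRED` and `commit_satisfied` are hypothetical names used only for illustration.

```python
from collections import Counter

# Hypothetical requirement: how many acknowledgements each role must supply.
REQUIRED = {"reporting": 3, "berlin": 1, "tokyo": 1}


def commit_satisfied(required, acked_roles):
    """Return True once every role has enough acknowledgements.

    `acked_roles` holds one role name per standby that has acknowledged
    the commit, i.e. the role that standby announced when it connected.
    """
    counts = Counter(acked_roles)
    return all(counts[role] >= needed for role, needed in required.items())
```

A role of size 1, such as `{"standby_a": 1}`, then behaves like an individually named standby.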

Regards,

David
--
David Christensen
End Point Corporation
david(at)endpoint(dot)com


From: Mike Rylander <mrylander(at)gmail(dot)com>
To: David Christensen <david(at)endpoint(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 19:29:48
Message-ID: AANLkTimtw6oS4Bt6f=5Nn0vmziY5xGyTamGQMwdZP-vP@mail.gmail.com

On Mon, Oct 4, 2010 at 3:25 PM, David Christensen <david(at)endpoint(dot)com> wrote:
>
> On Oct 4, 2010, at 2:02 PM, Robert Haas wrote:
>
>> On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
>>> On 10/04/2010 05:20 PM, Robert Haas wrote:
>>>> Quorum commit, even with configurable vote weights, can't handle a
>>>> requirement that a particular commit be replicated to (A || B) && (C
>>>> || D).
>>>
>>> Good point.
>>>
>>> Can the proposed standby registration configuration format cover such a
>>> requirement?
>>
>> Well, if you can name the standbys, there's no reason there couldn't
>> be a parameter that takes a string that looks pretty much like the
>> above.  There are, of course, some situations that could be handled
>> more elegantly by quorum commit ("any 3 of 5 available standbys") but
>> the above is more general and not unreasonably longwinded for
>> reasonable numbers of standbys.
>
>
> Is there any benefit to be had from having standby roles instead of individual names?  For instance, you could integrate this into quorum commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and 1 "tokyo" standby from a group of multiple per data center, or even just utilize role sizes of 1 if you wanted individual standbys to be "named" in this fashion.  This role could be provided on connect of the standby is more-or-less tangential to the specific registration issue.
>

Big +1 FWIW.

--
Mike Rylander


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 19:45:26
Message-ID: 4CAA2ED6.6030000@agliodbs.com


>>> Quorum commit, even with configurable vote weights, can't handle a
>>> requirement that a particular commit be replicated to (A || B) && (C
>>> || D).
>> Good point.

If this is the only feature which standby registration is needed for,
has anyone written the code for it yet? Is anyone planning to?

If not, it seems like standby registration is not *required* for 9.1. I
still tend to think it would be nice to have from a DBA perspective, but
we should separate required from "nice to have".

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Christensen <david(at)endpoint(dot)com>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-04 21:32:27
Message-ID: AANLkTin0R7t+eyXTQceJhWF1XveaZ2Rb4-2uucGJJKb7@mail.gmail.com

On Mon, Oct 4, 2010 at 3:25 PM, David Christensen <david(at)endpoint(dot)com> wrote:
>
> On Oct 4, 2010, at 2:02 PM, Robert Haas wrote:
>
>> On Mon, Oct 4, 2010 at 1:57 PM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
>>> On 10/04/2010 05:20 PM, Robert Haas wrote:
>>>> Quorum commit, even with configurable vote weights, can't handle a
>>>> requirement that a particular commit be replicated to (A || B) && (C
>>>> || D).
>>>
>>> Good point.
>>>
>>> Can the proposed standby registration configuration format cover such a
>>> requirement?
>>
>> Well, if you can name the standbys, there's no reason there couldn't
>> be a parameter that takes a string that looks pretty much like the
>> above.  There are, of course, some situations that could be handled
>> more elegantly by quorum commit ("any 3 of 5 available standbys") but
>> the above is more general and not unreasonably longwinded for
>> reasonable numbers of standbys.
>
>
> Is there any benefit to be had from having standby roles instead of individual names?  For instance, you could integrate this into quorum commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and 1 "tokyo" standby from a group of multiple per data center, or even just utilize role sizes of 1 if you wanted individual standbys to be "named" in this fashion.  This role could be provided on connect of the standby is more-or-less tangential to the specific registration issue.

It's possible to construct a commit rule that is sufficiently complex
that this can't handle it, but it has to be pretty hairy. The
simplest example I can think of is A || ((B || C) && (D || E)). And
you could even handle that if you allow standbys to belong to multiple
roles; in fact, I think you can handle arbitrary Boolean formulas that
way by converting to conjunctive normal form. The use cases for such
complex formulas are fairly thin, though, so I'm not sure that's a
very compelling argument one way or the other. I think in the end
this is not much different from standby registration; you still have
registrations, they just represent groups of machines instead of
single machines.
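
The conjunctive-normal-form conversion can be sketched in Python for the example formula. Distributing the `||` gives (A || B || C) && (A || D || E), so A ends up in two roles and the commit rule becomes "at least one acknowledgement from every role". The role names and the `ROLES` layout are assumptions made for illustration.

```python
from itertools import combinations

STANDBYS = ("A", "B", "C", "D", "E")

# Hypothetical role membership: A belongs to both roles.
ROLES = {
    "role1": {"A", "B", "C"},
    "role2": {"A", "D", "E"},
}


def original_rule(acked):
    """A || ((B || C) && (D || E))"""
    return "A" in acked or (
        bool(acked & {"B", "C"}) and bool(acked & {"D", "E"})
    )


def role_rule(acked):
    """At least one acknowledgement from every role (one CNF clause each)."""
    return all(acked & members for members in ROLES.values())


def rules_agree():
    """Compare both rules on all 32 subsets of the five standbys."""
    for size in range(len(STANDBYS) + 1):
        for combo in combinations(STANDBYS, size):
            acked = set(combo)
            if original_rule(acked) != role_rule(acked):
                return False
    return True


print(rules_agree())  # prints True
```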

I think from a reporting point of view it's a little nicer to have
individual registrations rather than group registrations. For
example, you might ask the master which slaves are connected and where
they are in the WAL stream, or something like that, and with
individual standby names that's a bit easier to puzzle out. Of
course, you could have individual standby names (that are only for
identification) and use groups for everything else. That's maybe a
bit more complicated (each slave needs to send the master a
name-for-identification and a group) but it's certainly workable. We
might also in the future have cases where you want to group standbys
in one way for the commit-rule and another way for some other setting,
but I can't think of exactly what other setting you'd be likely to
want to set in a fashion orthogonal from commit rule, and even if we
did think of one, allowing standbys to be members of multiple groups
would solve that problem, too. That feels a bit more complex to me,
but it's not that likely to happen in practice, so it would probably
be OK. So I guess I think individual registrations are a bit cleaner
and likely to lead to slightly fewer knobs over the long term, but
group registrations seem like they could be made to work, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Christensen <david(at)endpoint(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 06:57:09
Message-ID: 4CAACC45.2030802@bluegap.ch

On 10/04/2010 11:32 PM, Robert Haas wrote:
> I think in the end
> this is not much different from standby registration; you still have
> registrations, they just represent groups of machines instead of
> single machines.

Such groups are often easy to represent in CIDR notation, which would
reduce the need for registering every single standby.

Anyway, I'm really with Josh on this. It's a configuration debate that
doesn't have much influence on the real implementation, as long as we
keep the 'what nodes and how long does the master wait' decision
flexible enough.

Regards

Markus Wanner


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration
Date: 2010-10-05 11:26:39
Message-ID: m2wrpwric0.fsf@2ndQuadrant.fr

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>>>> Quorum commit, even with configurable vote weights, can't handle a
>>>> requirement that a particular commit be replicated to (A || B) && (C
>>>> || D).
>>> Good point.

So I've been trying to come up with something manually and failed. I
blame the fever — without it maybe I wouldn't have tried…

Now, if you want this level of precision in the setup, all we seem to be
missing from the quorum facility as currently proposed would be to have
a quorum list instead (or a max, but that's not helping the "easy" side).

Given the weights A3 B2 C4 D4, you can ask for a quorum of 6 and you're
covered for your case, except that with C&&D you reach the quorum but
don't have what you asked for. Have the quorum input accept [6,7] and
it's easy to set up. Do we want that?
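
The weights above can be checked by enumeration. This Python sketch assumes a quorum is reached when the summed weights of the acknowledging standbys are at least 6, and lists every set of standbys on which that test disagrees with (A || B) && (C || D); the function names are illustrative.

```python
from itertools import combinations

WEIGHTS = {"A": 3, "B": 2, "C": 4, "D": 4}
QUORUM = 6


def rule(acked):
    """Target commit rule: (A || B) && (C || D)."""
    return bool(acked & {"A", "B"}) and bool(acked & {"C", "D"})


def mismatches(weights, quorum):
    """Return the standby sets where weighted quorum and rule() disagree."""
    names = sorted(weights)
    found = []
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            reached = sum(weights[s] for s in combo) >= quorum
            if reached != rule(set(combo)):
                found.append(combo)
    return found


print(mismatches(WEIGHTS, QUORUM))  # prints [('C', 'D')]
```

How a quorum list such as [6,7] would be evaluated is not spelled out in the message, so the sketch stops at the single threshold.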

> If not, it seems like standby registration is not *required* for 9.1. I
> still tend to think it would be nice to have from a DBA perspective, but
> we should separate required from "nice to have".

+1.
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: David Christensen <david(at)endpoint(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 12:33:29
Message-ID: 1286282009.2025.1292.camel@ebony

On Mon, 2010-10-04 at 14:25 -0500, David Christensen wrote:

> Is there any benefit to be had from having standby roles instead of
> individual names? For instance, you could integrate this into quorum
> commit to express 3 of 5 "reporting" standbys, 1 "berlin" standby and
> 1 "tokyo" standby from a group of multiple per data center, or even
> just utilize role sizes of 1 if you wanted individual standbys to be
> "named" in this fashion. This role could be provided on connect of
> the standby is more-or-less tangential to the specific registration
> issue.

There is substantial benefit in that config.

If we want to do relaying and path minimization, as is possible with
Slony, we would want to do

M -> S1 -> S2 where M is in London, S1 and S2 are in Berlin.

so that the master sends data only once to Berlin.

If we send to a group, we can also allow things to continue working if
S1 goes down, since S2 might then know it could connect to M directly.

That's complex and not something for the first release, IMHO.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 12:34:06
Message-ID: 1286282046.2025.1296.camel@ebony

On Mon, 2010-10-04 at 12:45 -0700, Josh Berkus wrote:
> >>> Quorum commit, even with configurable vote weights, can't handle a
> >>> requirement that a particular commit be replicated to (A || B) && (C
> >>> || D).
> >> Good point.

Asking for quorum_commit = 3 would cover that requirement.

Not exactly as requested, but in a way that is both simpler to express
and requires no changes to configuration after failover. ISTM better to
have a single parameter than 5 separate configuration files, with
behaviour that the community would not easily be able to validate.

> If this is the only feature which standby registration is needed for,
> has anyone written the code for it yet? Is anyone planning to?

(Not me)

> If not, it seems like standby registration is not *required* for 9.1. I
> still tend to think it would be nice to have from a DBA perspective, but
> we should separate required from "nice to have".

Agreed.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 12:57:35
Message-ID: AANLkTi=Z_W3X--5SVacWKkfFL1dH5Y3W0DL1WWtVNkDR@mail.gmail.com

On Tue, Oct 5, 2010 at 8:34 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Mon, 2010-10-04 at 12:45 -0700, Josh Berkus wrote:
>> >>> Quorum commit, even with configurable vote weights, can't handle a
>> >>> requirement that a particular commit be replicated to (A || B) && (C
>> >>> || D).
>> >> Good point.
>
> Asking for quorum_commit = 3 would cover that requirement.
>
> Not exactly as requested, but in a way that is both simpler to express
> and requires no changes to configuration after failover. ISTM better to
> have a single parameter than 5 separate configuration files, with
> behaviour that the community would not easily be able to validate.

That's just not the same thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:00:00
Message-ID: 1286287200.2025.1386.camel@ebony

On Tue, 2010-10-05 at 08:57 -0400, Robert Haas wrote:
> On Tue, Oct 5, 2010 at 8:34 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Mon, 2010-10-04 at 12:45 -0700, Josh Berkus wrote:
> >> >>> Quorum commit, even with configurable vote weights, can't handle a
> >> >>> requirement that a particular commit be replicated to (A || B) && (C
> >> >>> || D).
> >> >> Good point.
> >
> > Asking for quorum_commit = 3 would cover that requirement.
> >
> > Not exactly as requested, but in a way that is both simpler to express
> > and requires no changes to configuration after failover. ISTM better to
> > have a single parameter than 5 separate configuration files, with
> > behaviour that the community would not easily be able to validate.
>
> That's just not the same thing.

In what important ways does it differ? In both cases, no reply will be
received until both sites have confirmed receipt.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Markus Wanner" <markus(at)bluegap(dot)ch>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>, "Dimitri Fontaine" <dfontaine(at)hi-media(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:07:08
Message-ID: 4CAAEABC0200002500036504@gw.wicourts.gov

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> Robert Haas wrote:
>> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> Josh Berkus wrote:
>>>>>>> Quorum commit, even with configurable vote weights, can't
>>>>>>> handle a requirement that a particular commit be replicated
>>>>>>> to (A || B) && (C || D).
>>>>>> Good point.
>>>
>>> Asking for quorum_commit = 3 would cover that requirement.
>>>
>>> Not exactly as requested,

>> That's just not the same thing.
>
> In what important ways does it differ?

When you have one server functioning at each site you'll block until
you get a third machine back, rather than replicating to both sites
and remaining functional.
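
The difference can be enumerated with a short Python sketch. It assumes four standbys, two per site, with the site labels `site1` and `site2` chosen only for illustration, and compares a plain count of 3 against "at least one standby per site" for every set of standbys that is still reachable.

```python
from itertools import combinations

# Hypothetical layout: two standbys at each of two sites.
SITES = {"A": "site1", "B": "site1", "C": "site2", "D": "site2"}


def by_count(live, needed=3):
    """Commit proceeds once `needed` standbys have acknowledged."""
    return len(live) >= needed


def by_site(live):
    """Commit proceeds once every site has at least one acknowledgement."""
    return {SITES[s] for s in live} == set(SITES.values())


def differing_sets():
    """Return the sets of live standbys on which the two policies differ."""
    names = sorted(SITES)
    out = []
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            if by_count(combo) != by_site(combo):
                out.append(combo)
    return out


print(differing_sets())
# prints [('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D')]
```

In each of the four listed cases one standby per site is reachable, so the per-site rule lets the commit proceed while the count of 3 blocks.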

-Kevin


From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:18:20
Message-ID: 4CAB33AC.9080507@bluegap.ch

On 10/05/2010 04:07 PM, Kevin Grittner wrote:
> When you have one server functioning at each site you'll block until
> you get a third machine back, rather than replicating to both sites
> and remaining functional.

That's not a very likely failure scenario, but yes.

What if the admin wants to add a standby in Berlin, but still wants one
ack from each location? None of the current proposals make that simple
enough to not require adjustment in configuration.

Maybe defining something like: at least one from Berlin and at least one
from Tokyo (where Berlin and Tokyo could be defined by CIDR notation).
IMO that's closer to the admin's reality than a plain quorum but still
not as verbose as a full standby registration.
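
A sketch of the CIDR idea in Python, using the standard `ipaddress` module. The site names and address ranges are assumptions (the ranges are documentation prefixes, not real deployments), and `site_of` and `commit_satisfied` are hypothetical helpers rather than anything proposed on the list.

```python
import ipaddress

# Hypothetical site definitions; the networks are documentation
# prefixes (RFC 5737), not real deployments.
SITES = {
    "berlin": ipaddress.ip_network("192.0.2.0/24"),
    "tokyo": ipaddress.ip_network("198.51.100.0/24"),
}


def site_of(address):
    """Return the site whose network contains `address`, or None."""
    addr = ipaddress.ip_address(address)
    for name, network in SITES.items():
        if addr in network:
            return name
    return None


def commit_satisfied(acked_addresses):
    """True once at least one standby from every site has acknowledged."""
    seen = {site_of(a) for a in acked_addresses}
    return all(name in seen for name in SITES)
```

Under these assumptions, a new Berlin standby at any address inside 192.0.2.0/24 is picked up without touching `SITES`, which is the property asked for above.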

But maybe we should really defer that discussion...

Regards

Markus Wanner


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:33:04
Message-ID: 1286289184.2025.1440.camel@ebony

On Tue, 2010-10-05 at 09:07 -0500, Kevin Grittner wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> > Robert Haas wrote:
> >> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >>> Josh Berkus wrote:
> >>>>>>> Quorum commit, even with configurable vote weights, can't
> >>>>>>> handle a requirement that a particular commit be replicated
> >>>>>>> to (A || B) && (C || D).
> >>>>>> Good point.
> >>>
> >>> Asking for quorum_commit = 3 would cover that requirement.
> >>>
> >>> Not exactly as requested,
>
> >> That's just not the same thing.
> >
> > In what important ways does it differ?
>
> When you have one server functioning at each site you'll block until
> you get a third machine back, rather than replicating to both sites
> and remaining functional.

And that is so important a consideration that you would like to move
from one parameter in one file to a whole set of parameters, set
differently in 5 separate files? Is it a common use case that people
have more than 3 separate servers for one application, which is where
the difference shows itself?

Another check: does specifying replication by server in such detail mean
we can't specify robustness at the transaction level? If we gave up that
feature, it would be a great loss for performance tuning.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:41:55
Message-ID: AANLkTi=DygRAN1dte6z3kDVnm8HL=7+KLVPGR4n4SARD@mail.gmail.com

On Tue, Oct 5, 2010 at 10:33 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Tue, 2010-10-05 at 09:07 -0500, Kevin Grittner wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>> > Robert Haas wrote:
>> >> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> >>> Josh Berkus wrote:
>> >>>>>>> Quorum commit, even with configurable vote weights, can't
>> >>>>>>> handle a requirement that a particular commit be replicated
>> >>>>>>> to (A || B) && (C || D).
>> >>>>>> Good point.
>> >>>
>> >>> Asking for quorum_commit = 3 would cover that requirement.
>> >>>
>> >>> Not exactly as requested,
>>
>> >> That's just not the same thing.
>> >
>> > In what important ways does it differ?
>>
>> When you have one server functioning at each site you'll block until
>> you get a third machine back, rather than replicating to both sites
>> and remaining functional.
>
> And that is so important a consideration that you would like to move
> from one parameter in one file to a whole set of parameters, set
> differently in 5 separate files?

I don't accept that this is the trade-off being proposed. You seem
convinced that having the config all in one place on the master is
going to make things much more complicated, but I can't see why.

> Is it a common use case that people
> have more than 3 separate servers for one application, which is where
> the difference shows itself.

Much of the engineering we are doing centers around use cases that are
considerably more complex than what most people will do in real life.

> Another check: does specifying replication by server in such detail mean
> we can't specify robustness at the transaction level? If we gave up that
> feature, it would be a great loss for performance tuning.

No, I don't think it means that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:46:51
Message-ID: 1286290011.2025.1465.camel@ebony

On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:
> >>
> >> When you have one server functioning at each site you'll block until
> >> you get a third machine back, rather than replicating to both sites
> >> and remaining functional.
> >
> > And that is so important a consideration that you would like to move
> > from one parameter in one file to a whole set of parameters, set
> > differently in 5 separate files?
>
> I don't accept that this is the trade-off being proposed. You seem
> convinced that having the config all in one place on the master is
> going to make things much more complicated, but I can't see why.

But it is not "all in one place" because the file needs to be different
on 5 separate nodes. Which *does* make it more complicated than the
alternative, which is a single parameter, set the same everywhere.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Markus Wanner" <markus(at)bluegap(dot)ch>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Dimitri Fontaine" <dfontaine(at)hi-media(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 14:56:57
Message-ID: 4CAAF6690200002500036516@gw.wicourts.gov

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> Is it a common use case that people have more than 3 separate
> servers for one application, which is where the difference shows
> itself.

I don't know how common it is, but we replicate circuit court data
to two machines each at two sites. That way a disaster which took
out one building would leave us with the ability to run from the
other building and still take a machine out of the production mix
for scheduled maintenance or to survive a single-server failure at
the other site. Of course, there's no way we would make that
replication synchronous, and we're replicating from dozens of source
machines -- so I don't know if you can even count our configuration.

Still, the fact that we're replicating to two machines each at two
sites and that is the same example which came to mind for Robert,
suggests that perhaps it isn't *that* bizarre.

-Kevin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 15:03:09
Message-ID: 4CAB3E2D.8000108@agliodbs.com
Lists: pgsql-hackers


> Another check: does specifying replication by server in such detail mean
> we can't specify robustness at the transaction level? If we gave up that
> feature, it would be a great loss for performance tuning.

It's orthogonal. The kinds of configurations we're talking about simply
define what it will mean when you commit a transaction "with synch".

However, I think we're getting way the heck away from how far we really
want to go for 9.1. Can I point out to people that synch rep is going
to involve a fair bit of testing and debugging, and that maybe we don't
want to try to implement The World's Most Configurable Standby Spec as a
first step?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 15:32:59
Message-ID: 1286292779.2025.1586.camel@ebony
Lists: pgsql-hackers

On Tue, 2010-10-05 at 09:56 -0500, Kevin Grittner wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>
> > Is it a common use case that people have more than 3 separate
> > servers for one application, which is where the difference shows
> > itself.
>
> I don't know how common it is, but we replicate circuit court data
> to two machines each at two sites. That way a disaster which took
> out one building would leave us with the ability to run from the
> other building and still take a machine out of the production mix
> for scheduled maintenance or to survive a single-server failure at
> the other site. Of course, there's no way we would make that
> replication synchronous, and we're replicating from dozens of source
> machines -- so I don't know if you can even count our configuration.
>
> Still, the fact that we're replicating to two machines each at two
> sites and that is the same example which came to mind for Robert,
> suggests that perhaps it isn't *that* bizarre.

Hoping that you mean "bizarre" as "less common". I don't find Robert's
example in any way strange and respect his viewpoint.

I am looking for ways to simplify the specification so that we aren't
burdened with a level of complexity we can avoid in the majority of
cases. If we only need complex configuration to support a small minority
of cases, then I'd say we don't need that (yet). Adding that support
later will make it clearer what the additional cost/benefit is.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 15:46:29
Message-ID: AANLkTikybXnhevdfvt+krYAkra1ejFO8bqSZUUE7HKoX@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 5, 2010 at 10:46 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:
>> >>
>> >> When you have one server functioning at each site you'll block until
>> >> you get a third machine back, rather than replicating to both sites
>> >> and remaining functional.
>> >
>> > And that is so important a consideration that you would like to move
>> > from one parameter in one file to a whole set of parameters, set
>> > differently in 5 separate files?
>>
>> I don't accept that this is the trade-off being proposed.  You seem
>> convinced that having the config all in one place on the master is
>> going to make things much more complicated, but I can't see why.
>
> But it is not "all in one place" because the file needs to be different
> on 5 separate nodes. That *does* make it more complicated than the
> alternative, which is a single parameter set the same everywhere.

Well, you only need to have the file at all on nodes you want to fail
over to. And aren't you going to end up rejiggering the config when
you fail over anyway, based on what happened? I mean, suppose you
have three servers and you require sync rep to 2 slaves. If the
master falls over and dies, it seems likely you're going to want to
relax that restriction. Or suppose you have three servers and you
require sync rep to 1 slave. The first time you fail over, you're
going to probably want to leave that config as-is, but if you fail
over again, you're very likely going to want to change it.

This is really the key question for me. If distributing the
configuration throughout the cluster meant that we could just fail
over and keep on trucking, that would be, well, really neat, and a
very compelling argument for the design you are proposing. But since
that seems impossible to me, I'm arguing for centralizing the
configuration file for ease of management.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 16:40:55
Message-ID: 1286296855.2025.1737.camel@ebony
Lists: pgsql-hackers

On Tue, 2010-10-05 at 11:46 -0400, Robert Haas wrote:
> On Tue, Oct 5, 2010 at 10:46 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:
> >> >>
> >> >> When you have one server functioning at each site you'll block until
> >> >> you get a third machine back, rather than replicating to both sites
> >> >> and remaining functional.
> >> >
> >> > And that is so important a consideration that you would like to move
> >> > from one parameter in one file to a whole set of parameters, set
> >> > differently in 5 separate files?
> >>
> >> I don't accept that this is the trade-off being proposed. You seem
> >> convinced that having the config all in one place on the master is
> >> going to make things much more complicated, but I can't see why.
> >
> > But it is not "all in one place" because the file needs to be different
> > on 5 separate nodes. That *does* make it more complicated than the
> > alternative, which is a single parameter set the same everywhere.
>
> Well, you only need to have the file at all on nodes you want to fail
> over to. And aren't you going to end up rejiggering the config when
> you fail over anyway, based on what happened? I mean, suppose you
> have three servers and you require sync rep to 2 slaves. If the
> master falls over and dies, it seems likely you're going to want to
> relax that restriction. Or suppose you have three servers and you
> require sync rep to 1 slave. The first time you fail over, you're
> going to probably want to leave that config as-is, but if you fail
> over again, you're very likely going to want to change it.

Single failovers are common. Multiple failovers aren't. For me, the key
question is about what is the common case, not edge cases.

> This is really the key question for me. If distributing the
> configuration throughout the cluster meant that we could just fail
> over and keep on trucking, that would be, well, really neat, and a
> very compelling argument for the design you are proposing.

Good, thanks.

The important thing is in the minutes and hours immediately after
failover it will all still work; there is no need to change to a
different and very likely untested config.

If you configure N+1 or N+2 redundancy, we should assume that if you
lose a node you will be striving to quickly replace it rather than shrug
and say "you lose some". And note as well, that when you do add that
other node back in, you won't need to change the config back again
afterwards. It all just works and keeps working, so the DBA can spend
his time investigating the issue and seeing if they can get the original
master back up, not keeping one eye on the config files of the remaining
servers.

> But since
> that seems impossible to me, I'm arguing for centralizing the
> configuration file for ease of management.

You can't "centralize" something in 5 different places, at least not in
my understanding of the word.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 17:29:47
Message-ID: AANLkTi=CaAJhg5iuSkUzG=i7oOo-DZ96GJjDK1rZabkd@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 5, 2010 at 12:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Well, you only need to have the file at all on nodes you want to fail
>> over to.  And aren't you going to end up rejiggering the config when
>> you fail over anyway, based on what happened?  I mean, suppose you
>> have three servers and you require sync rep to 2 slaves.  If the
>> master falls over and dies, it seems likely you're going to want to
>> relax that restriction.  Or suppose you have three servers and you
>> require sync rep to 1 slave.  The first time you fail over, you're
>> going to probably want to leave that config as-is, but if you fail
>> over again, you're very likely going to want to change it.
>
> Single failovers are common. Multiple failovers aren't. For me, the key
> question is about what is the common case, not edge cases.

Hmm. But even in the single failover cases, it's very possible that
you might want to make a change. If you have two machines replicating
synchronously to each other in wait-forever and one of them goes down,
you're probably going to want to bring the other one up in
don't-wait-forever mode. Or to take a slightly more complex example,
suppose you have two fast machines and a slow machine. As long as
both fast machines are up, one will be the master and the other its
synchronous slave; the slow machine will be a reporting server. But
if one of the fast machines dies, we might then want to make the slow
machine a synchronous slave just to make sure that our data remains
absolutely safe, even though it costs us some performance.

Using quorum_commit as a way to allow failover to happen and things to
keep humming along without configuration changes is a pretty clever
idea, but I think it only works in fairly specific cases. For
example, the "three equal machines, sync me to one of the other two"
case is pretty slick, at least so long as you don't have more than one
failure. I really can't improve on your design for that case; I'm not
sure there's any improvement to be had. But I don't think your design
fits nearly as well in cases where the slaves aren't all equal, which
I actually think will be more common than not.
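
To make the "three equal machines, sync me to one of the other two"
case concrete, here is a minimal Python sketch. It is an illustration
only, not code from any patch: the node names, the QUORUM constant and
the can_commit helper are all hypothetical. It assumes every node
carries the same quorum setting, so the check is unchanged after one
failover but can no longer be satisfied after a second failure.

```python
NODES = {"a", "b", "c"}   # three equal machines (hypothetical names)
QUORUM = 1                # identical setting on every node (assumed)


def can_commit(master, alive, acked):
    """Return True once enough live standbys have acknowledged the commit."""
    # Standbys are the live nodes other than the current master.
    standbys = (NODES & alive) - {master}
    # Only acknowledgements from live standbys count towards the quorum.
    return len(acked & standbys) >= QUORUM


# All three up, "a" is master, "b" has acknowledged.
print(can_commit("a", {"a", "b", "c"}, {"b"}))   # True
# "a" has died, "b" was promoted, "c" acknowledges: same setting still works.
print(can_commit("b", {"b", "c"}, {"c"}))        # True
# Second failure: nobody is left to acknowledge, so the commit would block.
print(can_commit("b", {"b"}, set()))             # False
```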

>> But since
>> that seems impossible to me, I'm arguing for centralizing the
>> configuration file for ease of management.
>
> You can't "centralize" something in 5 different places, at least not in
> my understanding of the word.

Every design we're talking about involves at least some configuration
on every machine in the cluster, AFAICS. The no registration / quorum
commit solution sets the synchronization level and # of votes for each
standby on that standby, at least AIUI. The registration solution
sets that stuff (and maybe other things, like a per-standby
wal_keep_segments) on the master, and the standby just provides a
name.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 18:30:34
Message-ID: 1286303434.2025.2074.camel@ebony
Lists: pgsql-hackers

On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:

> Much of the engineering we are doing centers around use cases that are
> considerably more complex than what most people will do in real life.

Why are we doing it then?

What I have proposed behaves identically to Oracle Maximum Availability
mode. Though I have extended it with per-transaction settings and have
been able to achieve that with fewer parameters as well. Most
importantly, those settings need not change following failover.

The proposed "standby.conf" registration scheme is *stricter* than
Oracle's Maximum Availability mode, yet uses an almost identical
parameter framework. The behaviour is not useful for the majority of
production databases.

Requesting sync against *all* standbys is stricter even than the highest
level of Oracle: Maximum Protection. Why do we think we need a level of
strictness higher than Oracle's maximum level? And in the first release?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-05 19:08:31
Message-ID: AANLkTinexwAJx=ALKXV2Pr_7HOotbDHOsyJh9bfZzcNT@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 5, 2010 at 2:30 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Tue, 2010-10-05 at 10:41 -0400, Robert Haas wrote:
>> Much of the engineering we are doing centers around use cases that are
>> considerably more complex than what most people will do in real life.
>
> Why are we doing it then?

Because some people will, and whatever architecture we pick now will
be with us for a very long time. We needn't implement everything in
the first version, but we should try to avoid inextensible design
choices.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-06 16:26:22
Message-ID: 4CACA32E.2020206@2ndquadrant.com
Lists: pgsql-hackers

Josh Berkus wrote:
> However, I think we're getting way the heck away from how far we
> really want to go for 9.1. Can I point out to people that synch rep
> is going to involve a fair bit of testing and debugging, and that
> maybe we don't want to try to implement The World's Most Configurable
> Standby Spec as a first step?

I came up with the following initial spec for Most Configurable Standby
Setup Ever recently:

-The state of all available standby systems is exposed via a table-like
interface, probably an SRF.
-As each standby reports back a result, its entry in the table is
updated with what level of commit it has accomplished (recv, fsync, etc.)
-The table-like list of standby states is then passed to a function,
that you could write in SQL or whatever else makes you happy. The
function returns a boolean for whether sufficient commit guarantees have
been met yet. You can make the conditions required as complicated as
you like.
-Once that function returns true, commit on the master. Otherwise
return to waiting for standby responses.
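
The four steps above can be sketched roughly as follows. This is a toy
Python model under stated assumptions, not a proposed implementation:
the StandbyState class, the LEVELS ordering, wait_for_commit and the
one_per_site predicate are all made-up names, and the site names only
echo the two-datacenter example from earlier in the thread.

```python
from dataclasses import dataclass

# Acknowledgement levels, weakest to strongest (hypothetical ordering).
LEVELS = {"none": 0, "recv": 1, "fsync": 2}


@dataclass
class StandbyState:
    name: str
    level: str = "none"   # what the standby has accomplished so far


def wait_for_commit(states, reports, is_sufficient):
    """Apply standby reports in order; return how many reports were
    consumed before the predicate allowed the commit, or None."""
    by_name = {s.name: s for s in states}
    for count, (name, level) in enumerate(reports, start=1):
        by_name[name].level = level          # update the table entry
        if is_sufficient(states):            # user-supplied function
            return count                     # commit on the master
    return None                              # still waiting


def one_per_site(states):
    """Example predicate: one fsync'd standby at each of two sites."""
    def site_ok(prefix):
        return any(s.name.startswith(prefix)
                   and LEVELS[s.level] >= LEVELS["fsync"] for s in states)
    return site_ok("berlin") and site_ok("hongkong")


states = [StandbyState(n) for n in
          ("berlin1", "berlin2", "hongkong1", "hongkong2")]
reports = [("berlin1", "recv"), ("berlin2", "fsync"),
           ("hongkong2", "recv"), ("hongkong2", "fsync")]
print(wait_for_commit(states, reports, one_per_site))   # 4
```

The point of the sketch is only that the commit rule lives entirely in
the predicate, so any condition over the table of standby states can be
expressed without changing the waiting loop.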

So that's what I actually want here, because all subsets of it proposed
so far are way too boring. If you cannot express every possible standby
situation that anyone will ever think of via an arbitrary function hook,
obviously it's not worth building at all.

Now, the more relevant question: what do I actually need in order for a
Sync Rep feature in 9.1 to be useful to the people I talk to who want
it most? That would be a simple-to-configure setup where I list a
subset of "important" nodes, and the appropriate acknowledgement level I
want to hear from one of them. And when one of those nodes gives that
acknowledgement, commit on the master happens too. That's it. For use
cases like the commonly discussed "two local/two remote" situation, the
two remote ones would be listed as the important ones.
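
As a rough illustration of how little logic that simple setup needs,
here is a Python sketch. Nothing in it comes from an actual patch: the
function name, the LEVELS table and the standby names are assumptions
made for the example.

```python
# Acknowledgement levels, weakest to strongest (hypothetical ordering).
LEVELS = {"recv": 1, "fsync": 2}


def commit_released_by(important, required_level, acks):
    """Return the name of the first "important" standby whose
    acknowledgement meets the required level, or None if there is none."""
    for name, level in acks:
        if name in important and LEVELS[level] >= LEVELS[required_level]:
            return name
    return None


# "Two local/two remote": only the remote nodes are listed as important.
important = {"remote1", "remote2"}
acks = [("local1", "fsync"), ("remote2", "recv"), ("remote2", "fsync")]
print(commit_released_by(important, "fsync", acks))   # remote2
```

Acknowledgements from unlisted nodes, or at a weaker level than
requested, are simply ignored; a single qualifying acknowledgement is
enough.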

Until something that simple is committed, tested, debugged, and has had some
run-ins with the real world, I have minimal faith that an attempt at
anything more complicated has sufficient information to succeed. And
complete faith that even trying will fail to deliver something for 9.1.
The scope creep that seems to be happening here in the name of "this
will be hard to change so it must be right in the first version" boggles
my mind.

--
Greg Smith, 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-06 16:50:29
Message-ID: AANLkTinSNgtCA5-1CRM3jXNQ7M2wyvZ0e984YHjsqecN@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 6, 2010 at 12:26 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Now, the more relevant question: what do I actually need in order for a Sync
> Rep feature in 9.1 to be useful to the people I talk to who want it most?
> That would be a simple-to-configure setup where I list a subset of
> "important" nodes, and the appropriate acknowledgement level I want to hear
> from one of them.  And when one of those nodes gives that acknowledgement,
> commit on the master happens too.  That's it.  For use cases like the
> commonly discussed "two local/two remote" situation, the two remote ones
> would be listed as the important ones.

That sounds fine to me. How do the details work? Each slave
publishes a name to the master via a recovery.conf parameter, and the
master has a GUC listing the names of the important slaves?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 17:27:38
Message-ID: 4CAE030A.2060701@enterprisedb.com
Lists: pgsql-hackers

On 06.10.2010 19:26, Greg Smith wrote:
> Now, the more relevant question: what do I actually need in order for a
> Sync Rep feature in 9.1 to be useful to the people I talk to who want
> it most? That would be a simple-to-configure setup where I list a subset
> of "important" nodes, and the appropriate acknowledgement level I want
> to hear from one of them. And when one of those nodes gives that
> acknowledgement, commit on the master happens too. That's it. For use
> cases like the commonly discussed "two local/two remote" situation, the
> two remote ones would be listed as the important ones.

This feels like the best way forward to me. It gives some flexibility,
and doesn't need a new config file.

Let me check that I got this right, and add some details to make it more
concrete: Each standby is given a name. It can be something like
"boston1" or "testserver". It does *not* have to be unique across all
standby servers. In the master, you have a list of important,
synchronous, nodes that must acknowledge each commit before it is
acknowledged to the client.

The standby name is a GUC in the standby's configuration file:

standby_name='bostonserver'

The list of important nodes is also a GUC, in the master's configuration
file:

synchronous_standbys='bostonserver, oxfordserver'

To configure for a simple setup with a master and one synchronous
standby (which is not a very good setup from availability point of view,
as discussed to death), you give the standby a name, and put the same
name in synchronous_standbys in the master.

To configure a setup with a master and two standbys, so that a commit is
acknowledged to client as soon as either one of the standbys acknowledges
it, you give both standbys the same name, and put that name in the
synchronous_standbys list in the master. This is the configuration that
gives zero data loss in case one server fails, but also caters for
availability because you don't need to halt the master if one standby fails.

To configure a setup with a master and two standbys, so that a commit is
acknowledged to client after *both* standbys acknowledge it, you give
both standbys a different name, and list both names in
synchronous_standbys in the master.

I believe this will bend to most real life scenarios people have.
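
The naming rule described above (any one standby per name, every listed
name required) fits in a few lines. The Python below is only a sketch of
that rule under my reading of it: commit_acknowledged is a hypothetical
helper, and the acked_names argument stands for the standby_name values
of the standbys that have acknowledged so far.

```python
def commit_acknowledged(synchronous_standbys, acked_names):
    """True when every name in the master's list has been acknowledged
    by at least one standby carrying that name."""
    required = {n.strip() for n in synchronous_standbys.split(",")
                if n.strip()}
    return required <= set(acked_names)


# One synchronous standby.
print(commit_acknowledged("bostonserver", ["bostonserver"]))   # True

# Two standbys sharing one name: either acknowledgement is enough.
print(commit_acknowledged("pair", ["pair"]))                   # True

# Two differently named standbys: both must acknowledge.
print(commit_acknowledged("bostonserver, oxfordserver",
                          ["bostonserver"]))                   # False
print(commit_acknowledged("bostonserver, oxfordserver",
                          ["oxfordserver", "bostonserver"]))   # True
```

Under the same reading, giving two servers at one site the name
"berlin" and two at another the name "hongkong", then listing both
names, would express the "(A || B) && (C || D)" requirement raised
earlier in the thread.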

Now, the other big fight is over "wait forever" vs "timeout".
Personally, I stand firmly in the "wait forever" camp - you're nuts if
you want a timeout. However, I can see that not everyone agrees :-).
Fortunately, once we have robust "wait forever" behavior, it shouldn't
be hard at all to add a timeout option on top of that, for those who
want it. We should be able to have both options in 9.1.
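
A small sketch of why a timeout layers easily on top of "wait forever":
in the Python model below the two behaviours differ only in one optional
argument. The CommitWaiter class is hypothetical and says nothing about
how the real backend would wait; what to do after a timeout (commit
anyway, raise an error, and so on) is a policy question the sketch
leaves open.

```python
import threading


class CommitWaiter:
    """Toy model of a backend waiting for a standby acknowledgement."""

    def __init__(self):
        self._acked = threading.Event()

    def standby_acknowledged(self):
        # Called when the required acknowledgement arrives.
        self._acked.set()

    def wait_for_ack(self, timeout=None):
        # timeout=None means wait forever; otherwise give up after
        # `timeout` seconds. Returns True if the ack arrived in time.
        return self._acked.wait(timeout)


waiter = CommitWaiter()
print(waiter.wait_for_ack(timeout=0.01))   # False: no ack within the timeout
waiter.standby_acknowledged()
print(waiter.wait_for_ack())               # True: returns at once after the ack
```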

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Dave Page <dpage(at)pgadmin(dot)org>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 17:39:55
Message-ID: AANLkTimfmCGS7yrkksVKr_104773v8yxTsmPr4K1dG+B@mail.gmail.com
Lists: pgsql-hackers

On 10/7/10, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 06.10.2010 19:26, Greg Smith wrote:
>> Now, the more relevant question: what do I actually need in order for a
>> Sync Rep feature in 9.1 to be useful to the people I talk to who want
>> it most? That would be a simple-to-configure setup where I list a subset
>> of "important" nodes, and the appropriate acknowledgement level I want
>> to hear from one of them. And when one of those nodes gives that
>> acknowledgement, commit on the master happens too. That's it. For use
>> cases like the commonly discussed "two local/two remote" situation, the
>> two remote ones would be listed as the important ones.
>
> This feels like the best way forward to me. It gives some flexibility,
> and doesn't need a new config file.
>
> Let me check that I got this right, and add some details to make it more
> concrete: Each standby is given a name. It can be something like
> "boston1" or "testserver". It does *not* have to be unique across all
> standby servers. In the master, you have a list of important,
> synchronous, nodes that must acknowledge each commit before it is
> acknowledged to the client.
>
> The standby name is a GUC in the standby's configuration file:
>
> standby_name='bostonserver'
>
> The list of important nodes is also a GUC, in the master's configuration
> file:
>
> synchronous_standbys='bostonserver, oxfordserver'
>
> To configure for a simple setup with a master and one synchronous
> standby (which is not a very good setup from availability point of view,
> as discussed to death), you give the standby a name, and put the same
> name in synchronous_standbys in the master.
>
> To configure a setup with a master and two standbys, so that a commit is
> acknowledged to client as soon as either one of the standbys acknowledges
> it, you give both standbys the same name, and put that name in the
> synchronous_standbys list in the master. This is the configuration that
> gives zero data loss in case one server fails, but also caters for
> availability because you don't need to halt the master if one standby fails.
>
> To configure a setup with a master and two standbys, so that a commit is
> acknowledged to client after *both* standbys acknowledge it, you give
> both standbys a different name, and list both names in
> synchronous_standbys in the master.
>
> I believe this will bend to most real life scenarios people have.

+1. I think this would have met any needs of mine in my past life as a
sysadmin/dba.

>
> Now, the other big fight is over "wait forever" vs "timeout".
> Personally, I stand firmly in the "wait forever" camp - you're nuts if
> you want a timeout. However, I can see that not everyone agrees :-).
> Fortunately, once we have robust "wait forever" behavior, it shouldn't
> be hard at all to add a timeout option on top of that, for those who
> want it. We should be able to have both options
>

I disagree that you're nuts if you want this feature fwiw. +1 on your
suggested plan though :-)

/D


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 17:45:29
Message-ID: 4CAE0739.9030904@agliodbs.com
Lists: pgsql-hackers

On 10/7/10 10:27 AM, Heikki Linnakangas wrote:
> The standby name is a GUC in the standby's configuration file:
>
> standby_name='bostonserver'
>
> The list of important nodes is also a GUC, in the master's configuration
> file:
>
> synchronous_standbys='bostonserver, oxfordserver'

This seems to abandon Simon's concept of per-transaction synchronization
control. That seems like such a potentially useful feature that I'm
reluctant to abandon it just for administrative elegance.

Does this work together with that in some way I can't see?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 17:55:25
Message-ID: AANLkTi=-s2_FTz+JRwoJqZKCcfdvE=G=cvmVK4-7irtU@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 7, 2010 at 1:39 PM, Dave Page <dpage(at)pgadmin(dot)org> wrote:
> On 10/7/10, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 06.10.2010 19:26, Greg Smith wrote:
>>> Now, the more relevant question: what do I actually need in order for a
>>> Sync Rep feature in 9.1 to be useful to the people I talk to who want
>>> it most? That would be a simple-to-configure setup where I list a subset
>>> of "important" nodes, and the appropriate acknowledgement level I want
>>> to hear from one of them. And when one of those nodes gives that
>>> acknowledgement, commit on the master happens too. That's it. For use
>>> cases like the commonly discussed "two local/two remote" situation, the
>>> two remote ones would be listed as the important ones.
>>
>> This feels like the best way forward to me. It gives some flexibility,
>> and doesn't need a new config file.
>>
>> Let me check that I got this right, and add some details to make it more
>> concrete: Each standby is given a name. It can be something like
>> "boston1" or "testserver". It does *not* have to be unique across all
>> standby servers. In the master, you have a list of important,
>> synchronous, nodes that must acknowledge each commit before it is
>> acknowledged to the client.
>>
>> The standby name is a GUC in the standby's configuration file:
>>
>> standby_name='bostonserver'
>>
>> The list of important nodes is also a GUC, in the master's configuration
>> file:
>>
>> synchronous_standbys='bostonserver, oxfordserver'
>>
>> To configure for a simple setup with a master and one synchronous
>> standby (which is not a very good setup from availability point of view,
>> as discussed to death), you give the standby a name, and put the same
>> name in synchronous_standbys in the master.
>>
>> To configure a setup with a master and two standbys, so that a commit is
>> acknowledged to client as soon as either one of the standbys acknowledges
>> it, you give both standbys the same name, and put that name in the
>> synchronous_standbys list in the master. This is the configuration that
>> gives zero data loss in case one server fails, but also caters for
>> availability because you don't need to halt the master if one standby fails.
>>
>> To configure a setup with a master and two standbys, so that a commit is
>> acknowledged to client after *both* standbys acknowledge it, you give
>> both standbys a different name, and list both names in
>> synchronous_standbys in the master.
>>
>> I believe this will bend to most real life scenarios people have.
>
> +1. I think this would have met any needs of mine in my past life as a
> sysadmin/dba.

Before we get too far down the garden path here, this is actually
substantially more complicated than what Greg proposed. Greg was
proposing, as have some other folks I think, to focus only on the k=1
case - in other words, only one acknowledgment would ever be required
for any given commit. I think he's right to focus on that case,
because the multiple-ACKs-required solutions are quite a bit hairier.
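
To make the difference concrete, here is a minimal Python sketch of the
two acknowledgment rules. It is an illustration only, not code from
either patch; the function names and the idea of tracking acks as a set
of standby names are assumptions made for this example.

```python
def commit_ok_all_names(synchronous_standbys, acked_names):
    """Heikki's outline: every listed name needs an ack from at least
    one standby carrying that name (names need not be unique)."""
    return all(name in acked_names for name in synchronous_standbys)


def commit_ok_k1(synchronous_standbys, acked_names):
    """k = 1: a single ack from any listed standby is enough."""
    return any(name in acked_names for name in synchronous_standbys)


listed = ["bostonserver", "oxfordserver"]
acks = {"bostonserver"}  # only one standby has answered so far

print(commit_ok_all_names(listed, acks))  # False: oxfordserver is missing
print(commit_ok_k1(listed, acks))         # True: one ack suffices
```

The k=1 rule never has to track which subset of names is still
outstanding, which is a large part of why it is the simpler target.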

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 17:57:23
Message-ID: AANLkTimdPzP6sL_fLiXBkXC6f0YambkDb7RF7xYnhJcB@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 7, 2010 at 1:45 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 10/7/10 10:27 AM, Heikki Linnakangas wrote:
>> The standby name is a GUC in the standby's configuration file:
>>
>> standby_name='bostonserver'
>>
>> The list of important nodes is also a GUC, in the master's configuration
>> file:
>>
>> synchronous_standbys='bostonserver, oxfordserver'
>
> This seems to abandon Simon's concept of per-transaction synchronization
> control.  That seems like such a potentially useful feature that I'm
> reluctant to abandon it just for administrative elegance.
>
> Does this work together with that in some way I can't see?

I think they work together fine. Greg's idea is that you list the
important standbys, and a synchronization guarantee that you'd like to
have for at least one of them. Simon's idea - at least at 10,000 feet
- is that you can take a pass on that guarantee for transactions that
don't need it. I don't see why you can't have both.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 17:59:23
Message-ID: AANLkTinhDMaWs1+YyZca8M19nizbLdYovCq+QeqAKiHv@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 7, 2010 at 1:27 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> Let me check that I got this right, and add some details to make it more
> concrete: Each standby is given a name. It can be something like "boston1"
> or "testserver". It does *not* have to be unique across all standby servers.
> In the master, you have a list of important, synchronous, nodes that must
> acknowledge each commit before it is acknowledged to the client.
>
> The standby name is a GUC in the standby's configuration file:
>
> standby_name='bostonserver'
>
> The list of important nodes is also a GUC, in the master's configuration
> file:
>
> synchronous_standbys='bostonserver, oxfordserver'

+1.

It definitely covers the scenarios I want.

And even allows the ones I don't want, and don't understand either ;-)

And personally, I'd *love* it if the streaming replication protocol
was adjusted so that every streaming WAL client reported back its
role and receive/fsync/replay positions as part of the protocol
(allowing role and positions to be something "NULL"able/empty/0). I
think Simon demonstrated that the overhead to report it isn't high.
Again, in the deployments I'm wanting, the "slave" isn't a PG server,
but something like Magnus's stream-to-archive, so I can't query the
slave to see how far behind it is.
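
As a rough sketch of what such a report could look like, assuming a
hypothetical message with nullable fields (the StandbyReport class, its
field names, and the use of plain integers for WAL positions are all
inventions for this example, not part of the actual protocol):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandbyReport:
    """Hypothetical status message from a streaming WAL client.

    Every field may be None, e.g. an archiver has no replay position.
    Positions are plain integers (byte offsets) for simplicity.
    """
    role: Optional[str] = None
    received: Optional[int] = None
    fsynced: Optional[int] = None
    replayed: Optional[int] = None


def lag_bytes(master_position: int, reported: Optional[int]) -> Optional[int]:
    """Return how far behind a reported position is, or None if unreported."""
    if reported is None:
        return None
    return master_position - reported


archiver = StandbyReport(role="archive", received=9_000, fsynced=8_500)
print(lag_bytes(10_000, archiver.fsynced))   # 1500
print(lag_bytes(10_000, archiver.replayed))  # None
```

With something like this, the master could answer "how far behind is
it" for a client that cannot be queried directly.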

a.

--
Aidan Van Dyk                                             Create like a god,
aidan(at)highrise(dot)ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 18:33:03
Message-ID: 4CAE125F.2030204@agliodbs.com
Lists: pgsql-hackers


> I think they work together fine. Greg's idea is that you list the
> important standbys, and a synchronization guarantee that you'd like to
> have for at least one of them. Simon's idea - at least at 10,000 feet
> - is that you can take a pass on that guarantee for transactions that
> don't need it. I don't see why you can't have both.

So, two things:

1) This version of Standby Registration seems to add One More Damn Place
You Need To Configure Standby (OMDPYNTCS) without adding any
functionality you couldn't get *without* having a list on the master.
Can someone explain to me what functionality is added by this approach
vs. not having a list on the master at all?

2) I see Simon's approach where you can designate not just synch/asynch,
but synch *mode* per session to be valuable. I can imagine having
transactions I just want to "ack" vs. transactions I want to "apply"
according to application logic (e.g. customer personal information vs.
financial transactions). This approach would still seem to remove that
functionality. Does it?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 19:18:01
Message-ID: AANLkTi=U_p9-JK6tqoVucCMErO6ROS+LSm7T6ExQeai0@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 7, 2010 at 2:33 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> I think they work together fine.  Greg's idea is that you list the
>> important standbys, and a synchronization guarantee that you'd like to
>> have for at least one of them.  Simon's idea - at least at 10,000 feet
>> - is that you can take a pass on that guarantee for transactions that
>> don't need it.  I don't see why you can't have both.
>
> So, two things:
>
> 1) This version of Standby Registration seems to add One More Damn Place
> You Need To Configure Standby (OMDPYNTCS) without adding any
> functionality you couldn't get *without* having a list on the master.
> Can someone explain to me what functionality is added by this approach
> vs. not having a list on the master at all?

Well, then you couldn't have one strictly synchronous standby and one
asynchronous standby.

> 2) I see Simon's approach where you can designate not just synch/asynch,
> but synch *mode* per session to be valuable.  I can imagine having
> transactions I just want to "ack" vs. transactions I want to "apply"
> according to application logic (e.g. customer personal information vs.
> financial transactions).  This approach would still seem to remove that
> functionality.  Does it?

I'm not totally sure. I think we could probably avoid removing that
with careful detailed design.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 20:56:14
Message-ID: AANLkTinrPNM3-CboCVGog_i6u5ZKFf+Oj7Bdc5AGRpwD@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 7, 2010 at 10:27 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> The standby name is a GUC in the standby's configuration file:
>
> standby_name='bostonserver'
>

Fwiw I was hoping it would be possible to set every machine up with an
identical postgresql.conf file. That doesn't preclude this idea since
you could start up your server with a script that sets the GUC on the
command-line and that script could use whatever it wants to look up
its name such as using its hardware info to look it up in a database.
But just something to keep in mind.

In particular I would want to be able to configure everything
identically and then have each node run some kind of program which
determines its name and position in the replication structure. This
implies that each node given its identity and the total view of the
structure can figure out what it should be doing including whether to
be read-only or read-write, who to contact as its master, and whether
to listen from slaves.

If every node needs a configuration file specifying multiple
interdependent variables which are all different from server to server
it'll be too hard to keep them all in sync. I would rather tell every
node, "here's how to push to the archive, here's how to pull, here's
the whole master-slave structure even the parts you don't need to know
about and the redundant entry for yourself -- now here's your name go
figure out whether to push or pull and from where"
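
A minimal sketch of that idea, assuming a hypothetical shared
description of the structure that is identical on every node (the
TOPOLOGY layout, the node names, and plan_for() are made up for
illustration; nothing like this exists in the patches under discussion):

```python
# Hypothetical shared description, identical on every node.
TOPOLOGY = {
    "master": "alpha",
    "standbys": ["beta", "gamma"],
}


def plan_for(node_name, topology):
    """Derive one node's behaviour from its name and the shared structure."""
    if node_name == topology["master"]:
        return {"read_only": False, "pull_from": None,
                "accept_standbys": True}
    if node_name in topology["standbys"]:
        return {"read_only": True, "pull_from": topology["master"],
                "accept_standbys": False}
    raise ValueError(f"{node_name!r} is not part of the replication structure")


print(plan_for("beta", TOPOLOGY))
```

The only per-node input is the name; everything else follows from the
shared structure, so there is nothing to keep in sync by hand.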

--
greg


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-07 23:15:00
Message-ID: 4CAE5474.8030906@2ndquadrant.com
Lists: pgsql-hackers

Josh Berkus wrote:
> This version of Standby Registration seems to add One More Damn Place
> You Need To Configure Standby (OMDPYNTCS) without adding any
> functionality you couldn't get *without* having a list on the master.
> Can someone explain to me what functionality is added by this approach
> vs. not having a list on the master at all?
>

That little design outline I threw out there wasn't intended to be a
plan for the right way to proceed here. What I was trying to do is point
out the minimum needed that would actually work for the use cases people
want the most, to shift discussion back toward simpler rather than more
complex configurations. If a more dynamic standby registration
procedure can get developed on schedule that's superior to that, great.
I think it really doesn't have to offer anything above automating what I
outlined to be considered good enough initially though.

And if the choice is between the stupid simple OMDPYNTCS idea I threw
out and demanding a design too complicated to deliver in 9.1, I'm quite
sure I'd rather have the hard to configure version that ships. Things
like keeping the master from having a hard-coded list of nodes and
making it easy for every node to have an identical postgresql.conf are
all great goals, but are also completely optional things for a first
release from where I'm standing. If a patch without any complicated
registration stuff got committed tomorrow, and promises to add better
registration on top of it in the next CommitFest didn't deliver, the
project would still be able to announce "Sync Rep is here in 9.1" in a
way people could and would use. We wouldn't be proud of the UI, but
that's normal in a "release early, release often" world.

The parts that scare me about sync rep are not in how to configure it,
it's in how it will break in completely unexpected ways related to the
communications protocol. And to even begin exploring that fully,
something simple has to actually get committed, so that there's a solid
target to kick off organized testing against. That's the point I'm
concerned about reaching as soon as feasible. And if it takes massive cuts
in the flexibility or ease of configuration to get there quickly, so
long as it doesn't actually hamper the core operating set here I would
consider that a good trade.

--
Greg Smith, 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 01:38:44
Message-ID: AANLkTi=JM+oQJWqhkhUj-a1PcDiZiMFtko_7MrEbemKC@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 7, 2010 at 7:15 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Josh Berkus wrote:
>>
>> This version of Standby Registration seems to add One More Damn Place
>> You Need To Configure Standby (OMDPYNTCS) without adding any
>> functionality you couldn't get *without* having a list on the master.
>> Can someone explain to me what functionality is added by this approach
>> vs. not having a list on the master at all?
>>
>
> That little design outline I threw out there wasn't intended to be a plan
> for right way to proceed here.  What I was trying to do is point out the
> minimum needed that would actually work for the use cases people want the
> most, to shift discussion back toward simpler rather than more complex
> configurations.  If a more dynamic standby registration procedure can get
> developed on schedule that's superior to that, great.  I think it really
> doesn't have to offer anything above automating what I outlined to be
> considered good enough initially though.
>
> And if the choice is between the stupid simple OMDPYNTCS idea I threw out
> and demanding a design too complicated to deliver in 9.1, I'm quite sure I'd
> rather have the hard to configure version that ships.  Things like keeping
> the master from having a hard-coded list of nodes and making it easy for
> every node to have an identical postgresql.conf are all great goals, but are
> also completely optional things for a first release from where I'm standing.
>  If a patch without any complicated registration stuff got committed
> tomorrow, and promises to add better registration on top of it in the next
> CommitFest didn't deliver, the project would still be able to announce "Sync
> Rep is here in 9.1" in a way people could and would use.  We wouldn't be
> proud of the UI, but that's normal in a "release early, release often"
> world.
>
> The parts that scare me about sync rep are not in how to configure it, it's
> in how it will break in completely unexpected ways related to the
> communications protocol.  And to even begin exploring that fully, something
> simple has to actually get committed, so that there's a solid target to kick
> off organized testing against.  That's the point I'm concerned about
> reaching as soon as feasible.  And if takes massive cuts in the flexibility
> or easy of configuration to get there quickly, so long as it doesn't
> actually hamper the core operating set here I would consider that a good
> trade.

Yes, let's please just implement something simple and get it
committed. k = 1. Two GUCs (synchronous_standbys = name, name, name
and synchronous_waitfor = none|recv|fsync|apply), SUSET so you can
change it per txn. Done. We can revise it *the day after it's
committed* if we agree on how. And if we *don't* agree, then we can
ship it and we still win.
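
For what it's worth, the rule is small enough to sketch in a few lines
of Python. This is only an illustration of the proposed semantics, not
code from either patch. It assumes the wait levels are ordered (a
standby that has applied a commit has also received and fsynced it), and
it treats an empty list or 'none' as "synchronous replication is off";
the function name and the progress dictionary are invented for the
example.

```python
# Assumed ordering: a standby that has reached a later stage has also
# passed the earlier ones (recv -> fsync -> apply).
LEVELS = {"recv": 1, "fsync": 2, "apply": 3}


def commit_may_return(synchronous_standbys, synchronous_waitfor, progress):
    """progress maps standby name -> furthest stage reached for this commit.

    Returns True when the master may acknowledge the commit to the client.
    """
    if synchronous_waitfor == "none" or not synchronous_standbys:
        return True  # synchronous replication is off for this transaction
    needed = LEVELS[synchronous_waitfor]
    # k = 1: one listed standby at or beyond the requested stage is enough.
    return any(
        LEVELS.get(progress.get(name), 0) >= needed
        for name in synchronous_standbys
    )


standbys = ["bostonserver", "oxfordserver"]
progress = {"bostonserver": "recv", "oxfordserver": "fsync"}

print(commit_may_return(standbys, "fsync", progress))  # True
print(commit_may_return(standbys, "apply", progress))  # False
print(commit_may_return(standbys, "none", progress))   # True
```

Because synchronous_waitfor is just an argument here, changing it per
transaction amounts to passing a different value for that one commit.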

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 05:43:40
Message-ID: 4CAEAF8C.1070903@enterprisedb.com
Lists: pgsql-hackers

On 07.10.2010 23:56, Greg Stark wrote:
> On Thu, Oct 7, 2010 at 10:27 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> The standby name is a GUC in the standby's configuration file:
>>
>> standby_name='bostonserver'
>>
>
> Fwiw I was hoping it would be possible to set every machine up with an
> identical postgresql.conf file.

This proposal allows that. At least assuming you have a simple setup of
one master and N standbys, and you're happy with a reply from any
standby, as opposed to all standbys. You just set both standby_name and
synchronous_standbys GUCs to 'foo' on all servers, and you're done.

You'll need to point each standby's primary_conninfo setting to the
current master, though, but that's no different from the situation today
with asynchronous replication. Presumably you'll have a virtual IP
address or host name that always points to the current master, also used
by the actual applications connecting to the database.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 05:48:18
Message-ID: 4CAEB0A2.8090704@enterprisedb.com
Lists: pgsql-hackers

On 07.10.2010 21:33, Josh Berkus wrote:
> 1) This version of Standby Registration seems to add One More Damn Place
> You Need To Configure Standby (OMDPYNTCS) without adding any
> functionality you couldn't get *without* having a list on the master.
> Can someone explain to me what functionality is added by this approach
> vs. not having a list on the master at all?

It's just one GUC. Without the list, there would have to be at least a
boolean option to enable/disable it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 06:52:16
Message-ID: AANLkTimLayAEnqvKscugKmiSBEyE5T3Td63N5BSZ6Y=_@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 10:38 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Yes, let's please just implement something simple and get it
> committed.  k = 1.  Two GUCs (synchronous_standbys = name, name, name
> and synchronous_waitfor = none|recv|fsync|apply)

For my cases, I'm OK with this as the first commit, for now.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 08:29:56
Message-ID: 4CAED684.5070304@gmail.com
Lists: pgsql-hackers

Robert Haas wrote:
> Yes, let's please just implement something simple and get it
> committed. k = 1. Two GUCs (synchronous_standbys = name, name, name
> and synchronous_waitfor = none|recv|fsync|apply), SUSET so you can
> change it per txn. Done. We can revise it *the day after it's
> committed* if we agree on how. And if we *don't* agree, then we can
> ship it and we still win.
>
I like the idea of something simple committed first, and am trying to
understand what's said above.

k = 1 : wait for only one ack
two gucs: does this mean configurable in postgresql.conf at the master,
and changeable with SET commands on the master depending on options? Are
both gucs mutable?
synchronous_standbys: I'm wondering if this registration is necessary in
this simple setup. What are the names used for? Could they be removed?
Should they also be configured at each standby?
synchronous_waitfor: If configured on the master, how is it updated to
the standbys? What does being able to configure 'none' mean? k = 0? I
smell a POLA violation here.

regards
Yeb Havinga


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 13:06:10
Message-ID: AANLkTinYoO9Li32oEMRppKOxWrP0D2kCAC+smJr_cXSj@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 4:29 AM, Yeb Havinga <yebhavinga(at)gmail(dot)com> wrote:
> Robert Haas wrote:
>>
>> Yes, let's please just implement something simple and get it
>> committed.  k = 1.  Two GUCs (synchronous_standbys = name, name, name
>> and synchronous_waitfor = none|recv|fsync|apply), SUSET so you can
>> change it per txn.  Done.  We can revise it *the day after it's
>> committed* if we agree on how.  And if we *don't* agree, then we can
>> ship it and we still win.
>>
>
> I like the idea of something simple committed first, and am trying to
> understand what's said above.
>
> k = 1 : wait for only one ack
> two gucs: does this mean configurable in postgresql.conf at the master, and
> changable with SET commands on the master depending on options? Are both
> gucs mutable?
> synchronous_standbys: I'm wondering if this registration is necessary in
> this simple setup. What are the named used for? Could they be removed?
> Should they also be configured at each standby?
> synchronous_waitfor: If configured on the master, how is it updated to the
> standbys? What does being able to configure 'none' mean? k = 0? I smell a
> POLA violation here.

Well, there's got to be some way to turn synchronous replication off.
The obvious methods are to allow synchronous_standbys to be set to
empty or to allow synchronous_waitfor to be set to none.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 16:29:59
Message-ID: 4CAF4707.3040405@agliodbs.com
Lists: pgsql-hackers

On 10/07/2010 06:38 PM, Robert Haas wrote:
> Yes, let's please just implement something simple and get it
> committed. k = 1. Two GUCs (synchronous_standbys = name, name, name
> and synchronous_waitfor = none|recv|fsync|apply), SUSET so you can
> change it per txn. Done. We can revise it *the day after it's
> committed* if we agree on how. And if we *don't* agree, then we can
> ship it and we still win.

If we have all this code, and it appears that we do, +1 to commit it now
so that we can start testing.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: standby registration (was: is sync rep stalled?)
Date: 2010-10-08 17:39:50
Message-ID: AANLkTi=uhmw60FFpF0vHjGo5o7wjaa5iNB0vmnK4e2BP@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 12:29 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 10/07/2010 06:38 PM, Robert Haas wrote:
>>
>> Yes, let's please just implement something simple and get it
>> committed.  k = 1.  Two GUCs (synchronous_standbys = name, name, name
>> and synchronous_waitfor = none|recv|fsync|apply), SUSET so you can
>> change it per txn.  Done.  We can revise it *the day after it's
>> committed* if we agree on how.  And if we *don't* agree, then we can
>> ship it and we still win.
>
> If we have all this code, and it appears that we do, +1 to commit it now so
> that we can start testing.

To the best of my knowledge we don't have exactly that thing, but it
seems like either of the two patches on the table could probably be
beaten into that shape with a large mallet in fairly short order, and
I think we should pick one of them and do just that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company