Re: replication docs: split single vs. multi-master

Lists: pgsql-hackers pgsql-patches
From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: pgsql-patches(at)postgresql(dot)org
Subject: replication docs: split single vs. multi-master
Date: 2006-11-15 10:43:27
Message-ID: 455AEF4F.3010304@bluegap.ch
Lists: pgsql-hackers pgsql-patches

Hi,

as promised on -docs, here comes my proposal on how to improve the
replication documentation. The patches are split as follows and have to
be applied in order:

replication_doku_1.diff:

Smallest possible one-word change to warm up...

replication_doku_2.diff:

Moves down "Clustering For Parallel Query Execution", because
it's not a replication type, but a feature, see explanation below.

replication_doku_3.diff:

This is the most important part, splitting all replication types
into single- and multi-master replication. I'm new to SGML, so
please bear with me if this is not the right way to do it...

"Shared-Disk-Failover" does IMO not fall into a replication category.
Should we mention there that 'sharing' a disk using NFS or some
such is not recommended? (And, more importantly, that it does not
work as a multi-master replication solution.)

I've added a general paragraph describing Single-Master Replication.
I'm stating that 'Single-Master Replication is always asynchronous'.
Can anybody think of a counter example? Or a use case for sync
Single-Master Replication? The argument to put down is: if you go
sync, why don't you do Multi-Master right away?

Most of the "Clustering for Load Balancing" text applies to all
synchronous, Multi-Master Replication algorithms, even to
"Query Broadcasting". Thus it became the general description
of Multi-Master Replication. The section "Clustering for
Load Balancing" has been removed.

replication_doku_4.diff:

These are the text modifications I did to adjust to the new structure.
I've adjusted the Multi-Master Replication text to really be
appropriate for all existing solutions.

"Query Broadcasting" has some corrections, mainly so that it sticks to
describing that algorithm and none of the general properties of
Multi-Master Replication.

I've added two sections to describe the 2PC and Distributed SHMEM
algorithms, which belong in that category and cover all of the
previous text, except that I've removed the mention of Oracle RAC
in favor of Pgpool-II.

IMO this makes it clearer what replication types exist and how to
categorize them. I'm tempted to mention the Postgres-R algorithm as a
fourth sub-section of Multi-Master Replication, as it's quite different
from all the others in many aspects. But I urgently need to go to
work now... besides, I'm heavily biased regarding Postgres-R, so
probably someone else should write that paragraph. :-)

The only downside of the structure I'm proposing here is that the
non-replication algorithms fall by the wayside somewhat, namely
"Shared-Disk Failover", "Data Partitioning", "Parallel Query Execution"
and "Commercial Solutions".

For me, "Data Partitioning" as well as "Parallel Query Execution" are
possible optimizations which can be run on top of replicated data. They
don't replicate data and are thus not replication solutions. But
grouping those two together would make sense.

So. I really have to go to work now!

Regards

Markus

Attachment               Content-Type  Size
replication_doku_1.diff  text/x-patch  844 bytes
replication_doku_2.diff  text/x-patch  3.3 KB
replication_doku_3.diff  text/x-patch  4.9 KB
replication_doku_4.diff  text/x-patch  5.3 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: replication docs: split single vs. multi-master
Date: 2006-11-16 18:35:04
Message-ID: 200611161835.kAGIZ4t21504@momjian.us
Lists: pgsql-hackers pgsql-patches

Markus Schiltknecht wrote:
> Hi,
>
> as promised on -docs, here comes my proposal on how to improve the
> replication documentation. The patches are split as follows and have to
> be applied in order:
>
> replication_doku_1.diff:
>
> Smallest possible one-word change to warm-up...

Done.

>
>
> replication_doku_2.diff:
>
> Moves down "Clustering For Parallel Query Execution", because
> it's not a replication type, but a feature, see explanation below.
>

Actually the patch moves down data partitioning. I am confused.

> replication_doku_3.diff:
>
> This is the most important part, splitting all replication types
> into single- and multi-master replication. I'm new to SGML, so
> please bear with me if this is not the right way to do it...
>
> "Shared-Disk-Failover" does IMO not fall into a replication category.
> Should we mention there, that 'sharing' a disk using NFS or some
> such is not recommended? (And more importantly, does not work as
> a multi-master replication solution)
>
> I've added a general paragraph describing Single-Master Replication.
> I'm stating that 'Single-Master Replication is always asynchronous'.
> Can anybody think of a counter example? Or a use case for sync
> Single-Master Replication? The argument to put down is: if you go
> sync, why don't you do Multi-Master right away?
>
> Most of the "Clustering for Load Balancing" text applies to all
> synchronous, Multi-Master Replication algorithms, even to
> "Query Broadcasting". Thus it became the general description
> of Multi-Master Replication. The section "Clustering for
> Load Balancing" has been removed.

I thought a long time about this. I have always liked splitting the
solutions up into single and multi-master, but in doing this
documentation section, I realized that the split isn't all that helpful,
and can be confusing. For example, Slony is clearly single-master, but
what about data partitioning? That is multi-master, in that there is
more than one master, but only one master per data set. And for
multi-master, Oracle RAC is clearly multi master, and I can see pgpool
as multi-master, or as several single-master systems, in that they
operate independently. After much thought, it seems that putting things
into single/multi-master categories just adds more confusion, because
several solutions just aren't clear, or fall into neither, e.g. Shared
Disk Failover. Another issue is that you mentioned heavy locking for
multi-master, when in fact pgpool doesn't do any special inter-server
locking, so it just doesn't apply.

In summary, it just seemed clearer to talk about each item and how it
works, rather than try to categorize them. The categorization just
seems to do more harm than good.

Of course, I might be totally wrong, and am still looking for feedback,
but these are my current thoughts. Feedback?

I didn't mention distributed shared memory as a separate item because I
felt it was an implementation detail of clustering, rather than
something separate. I kept two-phase in the cluster item for the same
reason.

Current version at:

http://momjian.us/main/writings/pgsql/sgml/failover.html

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: replication docs: split single vs. multi-master
Date: 2006-11-16 20:46:51
Message-ID: 455CCE3B.4080809@bluegap.ch
Lists: pgsql-hackers pgsql-patches

Hello Bruce,

Bruce Momjian wrote:
> Actually the patch moves down data partitioning. I am confused.

Uh.. yeah, sorry, that's what I meant.

> I thought a long time about this. I have always liked splitting the
> solutions up into single and multi-master, but in doing this
> documentation section, I realized that the split isn't all that helpful,
> and can be confusing.

Not mentioning that categorization doesn't help clear up the
confusion. Just look around: most people use these terms. They're used
by MySQL and Oracle. Even Microsoft's Active Directory seems to have a
multi-master operation mode.

> For example, Slony is clearly single-master,

Agreed.

> but
> what about data partitioning? That is multi-master, in that there is
> more than one master, but only one master per data set.

Data Partitioning is a way to work around the trouble of database
replication in the application layer. Instead of trying to categorize it
like a replication algorithm, we should explain that working around the
trouble may be worthwhile in many cases.

> And for
> multi-master, Oracle RAC is clearly multi master,

Yes.

> and I can see pgpool
> as multi-master, or as several single-master systems, in that they
> operate independently.

Several single-master systems? C'mon! Pgpool simply implements the most
simplistic form of multi-master replication. Just because you can access
the single databases inside the cluster doesn't make it less
Multi-Master, does it?

> After much thought, it seems that putting things
> into single/multi-master categories just adds more confusion, because
> several solutions just aren't clear

Agreed, I'm not saying you must categorize all solutions you describe.
But please do categorize the ones which can be (and have so often been)
categorized.

> or fall into neither, e.g. Shared Disk Failover.

Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a
warning "about the risk of having two postmaster come up...".

What about other means of sharing disks or filesystems? NBDs or even
worse: NFS?

> Another issue is that you mentioned heavy locking for
> multi-master, when in fact pgpool doesn't do any special inter-server
> locking, so it just doesn't apply.

Sure it does apply, in the sense that *every* single lock is granted and
released on *every* node. The total number of locks scales linearly with
the number of nodes in the cluster.
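
To make the scaling claim concrete, here is a minimal sketch in Python. It assumes a simplified model in which a broadcast transaction takes the same locks on every node; the function name `total_locks` and the figure of 5 locks per transaction are made up for illustration and are not taken from pgpool.

```python
def total_locks(locks_per_transaction: int, nodes: int) -> int:
    """Cluster-wide lock count when one transaction is replayed on every node.

    Assumption (hypothetical model): each node grants exactly the locks a
    single server would grant for the same transaction.
    """
    if locks_per_transaction < 0 or nodes < 1:
        raise ValueError("need a non-negative lock count and at least one node")
    return locks_per_transaction * nodes


# Print how the cluster-wide total grows as nodes are added.
for node_count in (1, 2, 4, 8):
    print(node_count, total_locks(5, node_count))
```

In this model each node on its own sees no more locks than a single server would; it is only the cluster-wide sum that grows with the node count.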

> In summary, it just seemed clearer to talk about each item and how it
> works, rather than try to categorize them. The categorization just
> seems to do more harm than good.
>
> Of course, I might be totally wrong, and am still looking for feedback,
> but these are my current thoughts. Feedback?

AFAICT, the categorization into Single- and Multi-Master replication is
very common. I think that's partly because it's focused on the solution.
One can ask: do I want to write on all nodes or is a failover solution
sufficient? Or can I probably get away with a read-only Slave?

It's a categorization the user does, often before having a glimpse of
how complicated database replication really is. Thus, IMO, it would make
sense to help users and allow them to quickly find answers. (And we
can still tell them that it's not easy or even possible to categorize
all the solutions.)

> I didn't mention distributed shared memory as a separate item because I
> felt it was an implementation detail of clustering, rather than
> something separate. I kept two-phase in the cluster item for the same
> reason.

Why is pgpool not an implementation detail of clustering, then?

> Current version at:
>
> http://momjian.us/main/writings/pgsql/sgml/failover.html

That somehow doesn't work for me:

--- momjian.us ping statistics ---
15 packets transmitted, 0 received, 100% packet loss, time 14011ms

Just my 2 cents, in the hope to be of help.

Regards

Markus

[1]: Brad Nicholson's suggestion:
http://archives.postgresql.org/pgsql-admin/2006-11/msg00154.php


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-patches(at)postgresql(dot)org
Subject: Re: replication docs: split single vs. multi-master
Date: 2006-11-16 21:50:31
Message-ID: 200611162150.kAGLoVa06062@momjian.us
Lists: pgsql-hackers pgsql-patches

Bruce Momjian wrote:
> I didn't mention distributed shared memory as a separate item because I
> felt it was an implementation detail of clustering, rather than
> something separate. I kept two-phase in the cluster item for the same
> reason.
>
> Current version at:
>
> http://momjian.us/main/writings/pgsql/sgml/failover.html

I am now attaching the additional text I added based on your comments.
I have also changed the markup so all the solutions appear on the same
web page. I think seeing it all together might give us new ideas for
improvement.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachment Content-Type Size
/rtmp/diff text/x-diff 2.9 KB

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: replication docs: split single vs. multi-master
Date: 2006-11-16 22:02:46
Message-ID: 455CE006.4050708@bluegap.ch
Lists: pgsql-hackers pgsql-patches

Bruce Momjian wrote:
> I am now attaching the additional text I added based on your comments.
> I have also changed the markup so all the solutions appear on the same
> web page. I think seeing it all together might give us new ideas for
> improvement.

Good, it's definitely better to have it all on one page.

I just thought about the words 'master' and 'slave', which are
admittedly quite unfortunate. I remember reading about efforts to remove
them from geek-speech. They proposed to introduce better names. At least
with the old IDE drives, master- and slave-drives seem to disappear now...

Regards

Markus


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs. multi-master
Date: 2006-11-17 05:01:06
Message-ID: 200611170501.kAH517P01649@momjian.us
Lists: pgsql-hackers pgsql-patches

Markus Schiltknecht wrote:
> Not mentioning that categorization doesn't help in clearing the
> confusion. Just look around, most people use these terms. They're used
> by MySQL and Oracle. Even Microsoft's Active Directory seems to have a
> multi-master operation mode.

OK.

> > For example, Slony is clearly single-master,
>
> Agreed.
>
> > but
> > what about data partitioning? That is multi-master, in that there is
> > more than one master, but only one master per data set.
>
> Data Partitioning is a way to work around the trouble of database
> replication in the application layer. Instead of trying to categorize it
> like a replication algorithm, we should explain that working around the
> trouble may be worthwhile in many cases.

OK. I am still feeling that data partitioning is like master/slave
replication because you have to get that read-only copy to the other
server. If you split things up so data sets resided on only one
machine, you are right that would not be replication, but do people do
that? If so, it is almost another solution.

>
> > And for
> > multi-master, Oracle RAC is clearly multi master,
>
> Yes.
>
> > and I can see pgpool
> > as multi-master, or as several single-master systems, in that they
> > operate independently.
>
> Several single-master systems? C'mon! Pgpool simply implements the most
> simplistic form of multi-master replication. Just because you can access
> the single databases inside the cluster doesn't make it less
> Multi-Master, does it?

OK, changed to "Multi-Master Replication Using Query Broadcasting".

>
> > After much thought, it seems that putting things
> > into single/multi-master categories just adds more confusion, because
> > several solutions just aren't clear
>
> Agreed, I'm not saying you must categorize all solutions you describe.
> But please do categorize the ones which can be (and have so often been)
> categorized.

OK.

> > or fall into neither, e.g. Shared Disk Failover.
>
> Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a
> warning "about the risk of having two postmaster come up...".

Added.

>
> What about other means of sharing disks or filesystems? NBDs or even
> worse: NFS?

Added.

>
> > Another issue is that you mentioned heavy locking for
> > multi-master, when in fact pgpool doesn't do any special inter-server
> > locking, so it just doesn't apply.
>
> Sure it does apply, in the sense that *every* single lock is granted and
> released on *every* node. The total amount of locks scales linearly with
> the amount of nodes in the cluster.

Uh, but the locks are the same on each machine as if it was a single
server, while in a cluster, the locks are more intertwined with other
things that are happening on the server, no?

> > In summary, it just seemed clearer to talk about each item and how it
> > works, rather than try to categorize them. The categorization just
> > seems to do more harm than good.
> >
> > Of course, I might be totally wrong, and am still looking for feedback,
> > but these are my current thoughts. Feedback?
>
> AFAICT, the categorization in Single- and Multi-Master replication is
> very common. I think that's partly because it's focused on the solution.
> One can ask: do I want to write on all nodes or is a failover solution
> sufficient? Or can I probably get away with a read-only Slave?

OK.

> It's a categorization the user does, often before having a glimpse about
> how complicated database replication really is. Thus, IMO, it would make
> sense to help the user and allow him to quickly find answers. (And we
> can still tell them that it's not easy or even possible to categorize
> all the solutions.)
>
> > I didn't mention distributed shared memory as a separate item because I
> > felt it was an implementation detail of clustering, rather than
> > something separate. I kept two-phase in the cluster item for the same
> > reason.
>
> Why is pgpool not an implementation detail of clustering, then?
>
> > Current version at:
> >
> > http://momjian.us/main/writings/pgsql/sgml/failover.html
>
> That somehow doesn't work for me:

I lost power for a few hours. I am back online. I have updated the
docs at that URL. Please check and let me know.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-17 07:27:39
Message-ID: 1163748459.2941.16.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On a fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> Markus Schiltknecht wrote:
> > Not mentioning that categorization doesn't help in clearing the
> > confusion. Just look around, most people use these terms. They're used
> > by MySQL and Oracle. Even Microsoft's Active Directory seems to have a
> > multi-master operation mode.
>
> OK.
>
> > > For example, Slony is clearly single-master,
> >
> > Agreed.
> >
> > > but
> > > what about data partitioning? That is multi-master, in that there is
> > > more than one master, but only one master per data set.
> >
> > Data Partitioning is a way to work around the trouble of database
> > replication in the application layer. Instead of trying to categorize it
> > like a replication algorithm, we should explain that working around the
> > trouble may be worthwhile in many cases.
>
> OK. I am still feeling that data partitioning is like master/slave
> replication because you have to get that read-only copy to the other
> server. If you split things up so data sets resided on only one
> machine, you are right that would not be replication, but do people do
> that? If so, it is almost another solution.

People do that in cases where there are high write loads ("high" as in
"not 10+ times less than reads") and just replicating the RO copies
would be prohibitively expensive in either network, CPU or memory terms.

pl/proxy is one tool for doing it. You can get the latest stable version
from https://developer.skype.com/SkypeGarage/DbProjects .

> > > And for
> > > multi-master, Oracle RAC is clearly multi master,
> >
> > Yes.
> >
> > > and I can see pgpool
> > > as multi-master, or as several single-master systems, in that they
> > > operate independently.
> >
> > Several single-master systems? C'mon! Pgpool simply implements the most
> > simplistic form of multi-master replication.

In what way is pgpool multi-master? Last time I looked, it did nothing
but apply DML to several databases, i.e. it is not replication at all,
or at least it is masterless, unless we think of the pgpool process
itself as the _single_ master :)

> Just because you can access
> > the single databases inside the cluster doesn't make it less
> > Multi-Master, does it?
>
> OK, changed to "Multi-Master Replication Using Query Broadcasting".

I think this gives a completely wrong picture of what pgpool does.

How about just "Query Broadcasting"?

> >
> > > After much thought, it seems that putting things
> > > into single/multi-master categories just adds more confusion, because
> > > several solutions just aren't clear
> >
> > Agreed, I'm not saying you must categorize all solutions you describe.
> > But please do categorize the ones which can be (and have so often been)
> > categorized.
>
> OK.
>
> > > or fall into neither, e.g. Shared Disk Failover.
> >
> > Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a
> > warning "about the risk of having two postmaster come up...".
>
>
> Added.
>
> >
> > What about other means of sharing disks or filesystems? NBDs or even
> > worse: NFS?
>
> Added.
>
> >
> > > Another issue is that you mentioned heavy locking for
> > > multi-master, when in fact pgpool doesn't do any special inter-server
> > > locking, so it just doesn't apply.
> >
> > Sure it does apply, in the sense that *every* single lock is granted and
> > released on *every* node. The total amount of locks scales linearly with
> > the amount of nodes in the cluster.
>
> Uh, but the locks are the same on each machine as if it was a single
> server, while in a cluster, the locks are more intertwined with other
> things that are happening on the server, no?
>
> > > In summary, it just seemed clearer to talk about each item and how it
> > > works, rather than try to categorize them. The categorization just
> > > seems to do more harm than good.
> > >
> > > Of course, I might be totally wrong, and am still looking for feedback,
> > > but these are my current thoughts. Feedback?
> >
> > AFAICT, the categorization in Single- and Multi-Master replication is
> > very common. I think that's partly because it's focused on the solution.
> > One can ask: do I want to write on all nodes or is a failover solution
> > sufficient? Or can I probably get away with a read-only Slave?
>
> OK.
>
> > It's a categorization the user does, often before having a glimpse about
> > how complicated database replication really is. Thus, IMO, it would make
> > sense to help the user and allow him to quickly find answers. (And we
> > can still tell them that it's not easy or even possible to categorize
> > all the solutions.)
> >
> > > I didn't mention distributed shared memory as a separate item because I
> > > felt it was an implementation detail of clustering, rather than
> > > something separate. I kept two-phase in the cluster item for the same
> > > reason.
> >
> > Why is pgpool not an implementation detail of clustering, then?
> >
> > > Current version at:
> > >
> > > http://momjian.us/main/writings/pgsql/sgml/failover.html
> >
> > That somehow doesn't work for me:
>
> I lost power for a few hours. I am back online. I have updated the
> docs at that URL. Please check and let me know.
>
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-17 07:45:33
Message-ID: 1163749533.2941.20.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On a fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> > > Current version at:
> > >
> > > http://momjian.us/main/writings/pgsql/sgml/failover.html

It refers to "Warm Standby Using Point-In-Time Recovery"
(http://momjian.us/main/writings/pgsql/sgml/warm-standby.html). Maybe
it's a good idea to give pointers to SkyTools
(description: https://developer.skype.com/SkypeGarage/DbProjects/SkyTools ,
code: http://pgfoundry.org/projects/skytools/ ), which includes a
walmgr.py script that sets up and manages WAL-based standby servers.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs. multi-master
Date: 2006-11-17 08:02:47
Message-ID: 455D6CA7.2090702@bluegap.ch
Lists: pgsql-hackers pgsql-patches

Hello Bruce,

You wrote:
> I am still feeling that data partitioning is like master/slave
> replication because you have to get that read-only copy to the other
> server.

Yes, that's where replication comes into play. But data partitioning per
se has nothing to do with replication, has it? You can partition your
data however you want: among tablespaces, among databases or among
multiple servers. Data partitioning solves different problems than
replication. I think it's important to keep them separate. Why do you
mix Slony-I into the Data Partitioning section? One can use any other
replication solution to "get that read-only copy to the other server".

> If you split things up so data sets resided on only one
> machine, you are right that would not be replication, but do people do
> that? If so, it is almost another solution.

Yes, as I say: Data Partitioning solves another problem.

>>> And for
>>> multi-master, Oracle RAC is clearly multi master,
>> Yes.
>>
>>> and I can see pgpool
>>> as multi-master, or as several single-master systems, in that they
>>> operate independently.
>> Several single-master systems? C'mon! Pgpool simply implements the most
>> simplistic form of multi-master replication. Just because you can access
>> the single databases inside the cluster doesn't make it less
>> Multi-Master, does it?
>
> OK, changed to "Multi-Master Replication Using Query Broadcasting".

Good. That already reads better to me. ;-)

As Jim Nasby pointed out in [1], not all solutions are as simplistic as
pgpool and do not necessarily have the same disadvantages - while using
the very same algorithm: Query Broadcasting.

I suggest we make sure to clarify that and better point out some of the
aspects all Multi-Master Replication solutions have in common (see
replication_doku_4.diff of my patches).

> Added.
>
> Added.

(the additions to "Shared Disk Failover")

Good. Short and clear. (Except perhaps: how can I find out if NFS has
full POSIX behavior? Do we have to go into more detail there? I dunno.)

> Uh, but the locks are the same on each machine as if it was a single
> server, while in a cluster, the locks are more intertwined with other
> things that are happening on the server, no?

Sure.

Maybe you are right and we had better not use the term locking there.
It seems confusing because it's not clear what a 'lock' is for some
replication systems (e.g. Postgres-R: how do you compare its
"amount of locks"?).

Regards

Markus


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs. multi-master
Date: 2006-11-17 08:06:35
Message-ID: 455D6D8B.6090904@bluegap.ch
Lists: pgsql-hackers pgsql-patches

Good morning Hannu,

Hannu Krosing wrote:
> People do that in cases where there is high write loads ("high" as in
> "not 10+ times less than reads") and just replicating the RO copies
> would be prohibitively expensive in either network, cpu or memory terms.

Okay. In that case it's even less like any type of replication.

IMO, Data Partitioning is the simplest method of Load Balancing. It's
like saying: hey, if your database server is overloaded, simply split
your data over multiple servers.

That is not always possible and can lead to other problems, some of
which can be solved by replication solutions.
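
As a rough illustration of "split your data over multiple servers", here is a minimal sketch in Python that routes each key to exactly one server with a stable hash. The server names and the `server_for` helper are hypothetical, and real partitioning tools may choose servers quite differently.

```python
import hashlib
from typing import Sequence

# Hypothetical server names; in practice these would be connection strings.
SERVERS = ("db0", "db1", "db2")


def server_for(key: str, servers: Sequence[str] = SERVERS) -> str:
    """Map a partitioning key to exactly one server using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Every key lives on one server only, so nothing needs to be replicated.
placement = {user: server_for(user) for user in ("alice", "bob", "carol")}
print(placement)
```

Queries that touch keys on different servers are where the "other problems" start: the application has to combine the results itself, or a read-only copy has to be kept elsewhere.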

> In what way is pgpool multimaster ? last time I looked it did nothing
> but applying DML to several databases. i.e. it is not replication at all,

Please give your definition of replication.

Wikipedia gives us [1]: "Replication refers to the use of redundant
resources, such as software or hardware components, to improve
reliability, fault-tolerance, or performance."

Pgpool does that by Query Broadcasting, no?

> or at least it is masterless, unless we think of the pgpool process
> itself as the _single_ master :)

Hm. That's a good point. Pgpool allows writing to only one master (the
pgpool process) but reading from multiple, synchronous masters. I admit
that makes it a little hard to classify as Single- or Multi-Master.
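
The shape described here, one write entry point that fans out and reads served by any copy, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not pgpool's code: in-memory SQLite databases stand in for the backend servers, and the `QueryBroadcaster` class and its methods are invented for illustration.

```python
import sqlite3


class QueryBroadcaster:
    """Toy stand-in for a query-broadcasting middle layer (hypothetical)."""

    def __init__(self, node_count: int) -> None:
        # Each in-memory SQLite database plays the role of one backend server.
        self.nodes = [sqlite3.connect(":memory:") for _ in range(node_count)]
        self._next_reader = 0

    def write(self, sql: str, params: tuple = ()) -> None:
        # The same statement is sent to every node, so the copies stay
        # identical as long as the statement is deterministic.
        for node in self.nodes:
            node.execute(sql, params)
            node.commit()

    def read(self, sql: str, params: tuple = ()) -> list:
        # Reads go to a single node, chosen round-robin.
        node = self.nodes[self._next_reader]
        self._next_reader = (self._next_reader + 1) % len(self.nodes)
        return node.execute(sql, params).fetchall()


pool = QueryBroadcaster(node_count=3)
pool.write("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
pool.write("INSERT INTO accounts VALUES (?, ?)", (1, 100))

# Three consecutive reads hit three different nodes and agree.
results = [
    pool.read("SELECT balance FROM accounts WHERE id = ?", (1,))
    for _ in range(3)
]
print(results)
```

The sketch ignores the hard parts: non-deterministic statements such as ones calling random() or now() would make the copies diverge, and a failure on one node in the middle of a write is not handled.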

Doesn't Sequoia support multiple Query Broadcasting processes? Would it
qualify as Multi-Master *Replication*, then?

In an ideal implementation, every Master could broadcast queries to all
other masters, thus giving a *real* Multi-Master solution. Postgres-R
(6.4) did fall back into that mode for transactions which change a lot
of tuples, so that the writeset didn't exceed a certain size limit.

> I think this gives completely wrong picture of what pgpool does.

As I see it, that's because pgpool is a very limited implementation of
Query Broadcasting. But pgpool is not the only solution implementing
that algorithm. Do we want to describe the general algorithm or pgpool here?

Regards

Markus

[1]: Wikipedia about Replication (Computer Science):
http://en.wikipedia.org/wiki/Replication_%28computer_science%29


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-17 13:25:47
Message-ID: 200611171325.kAHDPlQ29412@momjian.us
Lists: pgsql-hackers pgsql-patches

Hannu Krosing wrote:
> > OK. I am still feeling that data partitioning is like master/slave
> > replication because you have to get that read-only copy to the other
> > server. If you split things up so data sets resided on only one
> > machine, you are right that would not be replication, but do people do
> > that? If so, it is almost another solution.
>
> People do that in cases where there is high write loads ("high" as in
> "not 10+ times less than reads") and just replicating the RO copies
> would be prohibitively expensive in either network, cpu or memory terms.

OK, as Markus suggested, I have moved Data Partitioning down to the
bottom, and mentioned it as only optionally keeping a read-only copy on
each server. Is this better?

> > > Several single-master systems? C'mon! Pgpool simply implements the most
> > > simplistic form of multi-master replication.
>
> In what way is pgpool multimaster? Last time I looked it did nothing
> but apply DML to several databases, i.e. it is not replication at all,
> or at least it is masterless, unless we think of the pgpool process
> itself as the _single_ master :)

I have removed the mention of "multi-master" from query broadcast.

>
> > Just because you can access
> > > the single databases inside the cluster doesn't make it less
> > > Multi-Master, does it?
> >
> > OK, changed to "Multi-Master Replication Using Query Broadcasting".
>
> I think this gives completely wrong picture of what pgpool does.
>
> How about just "Query Broadcasting" ?
>

Done.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-17 13:27:17
Message-ID: 200611171327.kAHDRHD29518@momjian.us
Lists: pgsql-hackers pgsql-patches

Hannu Krosing wrote:
> On a fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> > > > Current version at:
> > > >
> > > > http://momjian.us/main/writings/pgsql/sgml/failover.html
>
> it refers to "Warm Standby Using Point-In-Time Recovery"
> (http://momjian.us/main/writings/pgsql/sgml/warm-standby.html); maybe
> it's a good idea to give pointers to SkyTools
> (description: https://developer.skype.com/SkypeGarage/DbProjects/SkyTools,
> code: http://pgfoundry.org/projects/skytools/), which includes a
> walmgr.py script which sets up and manages WAL-based standby servers.

Isn't that functionality included in 8.2, which is what this
documentation is being included with?

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-17 13:55:21
Message-ID: 200611171355.kAHDtLT01617@momjian.us
Lists: pgsql-hackers pgsql-patches

Markus Schiltknecht wrote:
> Hello Bruce,
>
> You wrote:
> > I am still feeling that data partitioning is like master/slave
> > replication because you have to get that read-only copy to the other
> > server.
>
> Yes, that's where replication comes into play. But data partitioning per
> se has nothing to do with replication, has it? You can partition your
> data however you want: among tablespaces, among databases or among
> multiple servers. Data partitioning solves different problems than
> replication. I think it's important to keep them separate. Why do you
> mix-in Slony-I in the Data Partitioning Section? One can use any other
> replication solution to "get that read-only copy to the other server".

Yes, updated.

> >>> and I can see pgpool
> >>> as multi-master, or as several single-master systems, in that they
> >>> operate independently.
> >> Several single-master systems? C'mon! Pgpool simply implements the most
> >> simplistic form of multi-master replication. Just because you can access
> >> the single databases inside the cluster doesn't make it less
> >> Multi-Master, does it?
> >
> > OK, changed to "Multi-Master Replication Using Query Broadcasting".
>
> Good. That reads already better for me. ;-)

Oops, now modified to just "Query Broadcasting".

> As Jim Nasby pointed out in [1], not all solutions are as simplistic as
> pgpool and do not necessarily have the same disadvantages - while using
> the very same algorithm: Query Broadcasting.
>
> I suggest we make sure to clarify that and better point out some of the
> aspects all Multi-Master Replication have in common (see
> replication_doku_4.diff of my patches).
>
> > Added.
> >
> > Added.
>
> (the additions to "Shared Disk Failover")
>
> Good. Short and clear. (Except perhaps: how can I find out if NFS has
> full POSIX behavior? Do we have to go into more detail there? I dunno.)

Uh, I am unclear on that myself. I think NFS3 or NFS4 is OK, but am
unsure.

> > Uh, but the locks are the same on each machine as if it was a single
> > server, while in a cluster, the locks are more intertwined with other
> > things that are happening on the server, no?
>
> Sure.
>
> Maybe you are right and we should better not use the term locking there.
> It seems confusing because it's not clear what a 'lock' is for some
> replication systems (i.e. also Postgres-R, how do you compare its
> "amount of locks"?).

OK, locks are currently mentioned only for clustering.

URL updated:

http://momjian.us/main/writings/pgsql/sgml/failover.html

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-17 16:39:19
Message-ID: 200611171639.kAHGdJS14235@momjian.us
Lists: pgsql-hackers pgsql-patches


I have renamed the documentation section "High Availability and Load
Balancing". I think the current version takes many of your comments
below into account. Please let me know.

---------------------------------------------------------------------------

Markus Schiltknecht wrote:
> Good morning Hannu,
>
> Hannu Krosing wrote:
> > People do that in cases where there are high write loads ("high" as in
> > "not 10+ times less than reads") and just replicating the RO copies
> > would be prohibitively expensive in either network, cpu or memory terms.
>
> Okay. In that case it's even less like any type of replication.
>
> IMO, Data Partitioning is the most simple method of Load Balancing. It's
> like saying: hey, if your database server is overloaded, simply split
> your data over multiple servers.
>
> Which is not always possible and can lead to other problems. Some of
> which can be solved by replication solutions.
>
> > In what way is pgpool multimaster? Last time I looked it did nothing
> > but apply DML to several databases, i.e. it is not replication at all,
>
> Please give your definition of replication.
>
> Wikipedia gives us [1]: "Replication refers to the use of redundant
> resources, such as software or hardware components, to improve
> reliability, fault-tolerance, or performance."
>
> Pgpool does that by Query Broadcasting, no?
>
> > or at least it is masterless, unless we think of the pgpool process
> > itself as the _single_ master :)
>
> Hm. That's a good point. Pgpool allows writing to only one master (the
> pgpool process) but read from multiple, synchronous masters. I admit
> that makes it a little hard to split into Single- or Multi-Master.
>
> Doesn't Sequoia support multiple Query Broadcasting processes? Would it
> qualify as Multi-Master *Replication*, then?
>
> In an ideal implementation, every Master could broadcast queries to all
> other masters, thus giving a *real* Multi-Master solution. Postgres-R
> (6.4) did fall back into that mode for transactions which change a lot
> of tuples, so that the writeset didn't exceed a certain size limit.
>
> > I think this gives completely wrong picture of what pgpool does.
>
> As I see it, that's because pgpool is a very limited implementation of
> Query Broadcasting. But pgpool is not the only solution implementing
> that algorithm. Do we want to describe the general algorithm or pgpool here?
>
> Regards
>
> Markus
>
>
> [1]: Wikipedia about Replication (Computer Science):
> http://en.wikipedia.org/wiki/Replication_%28computer_science%29
>

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
To: bruce(at)momjian(dot)us
Cc: markus(at)bluegap(dot)ch, hannu(at)skype(dot)net, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-21 06:44:12
Message-ID: 20061121.154412.112616026.t-ishii@sraoss.co.jp
Lists: pgsql-hackers pgsql-patches

From high-availability.sgml:

Clustering For Parallel Query Execution

This allows multiple servers to work concurrently on a single
query. One possible way this could work is for the data to be
split among servers and for each server to execute its part of the
query and results sent to a central server to be combined and
returned to the user. There currently is no PostgreSQL open source
solution for this.

I think pgpool-II can do this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Cc: markus(at)bluegap(dot)ch, hannu(at)skype(dot)net, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] replication docs: split single vs.
Date: 2006-11-21 21:37:53
Message-ID: 200611212137.kALLbrU03719@momjian.us
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii wrote:
> From high-availability.sgml:
>
> Clustering For Parallel Query Execution
>
> This allows multiple servers to work concurrently on a single
> query. One possible way this could work is for the data to be
> split among servers and for each server to execute its part of the
> query and results sent to a central server to be combined and
> returned to the user. There currently is no PostgreSQL open source
> solution for this.
>
> I think pgpool-II can do this.

Thanks, I suspected it could, added:

This allows multiple servers to work concurrently on a single
query. One possible way this could work is for the data to be
split among servers and for each server to execute its part of
the query and results sent to a central server to be combined
and returned to the user. Pgpool-II has this capability.
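
As an illustration of the "split the data, execute each part, combine
centrally" idea in that paragraph, here is a minimal sketch. It is not how
Pgpool-II implements it: the in-memory SQLite shards, the sales table, and
the function names make_shard, partial_aggregate and parallel_avg are all
assumptions made up for this example.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(rows):
    """Create one stand-in 'server' holding its slice of the data."""
    # check_same_thread=False lets a worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def partial_aggregate(conn):
    """Each server runs its part of the query on local data only."""
    total, count = conn.execute(
        "SELECT COALESCE(SUM(amount), 0), COUNT(*) FROM sales"
    ).fetchone()
    return total, count


def parallel_avg(shards):
    """Central step: run the partial queries concurrently, then combine."""
    with ThreadPoolExecutor(max_workers=len(shards)) as executor:
        partials = list(executor.map(partial_aggregate, shards))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


shards = [
    make_shard([("eu", 10), ("eu", 20)]),
    make_shard([("us", 30)]),
    make_shard([("asia", 40), ("asia", 50)]),
]
average = parallel_avg(shards)
```

Each shard returns a sum and a row count rather than its own average,
because averaging the per-shard averages would give the wrong answer when
shards hold different numbers of rows.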

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +