Re: proposal: multiple read-write masters in a cluster with wal-streaming synchronization

From: Mark Dilger <markdilger(at)yahoo(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: multiple read-write masters in a cluster with wal-streaming synchronization
Date: 2013-12-31 21:51:08
Message-ID: 1388526668.53184.YahooMailNeo@web125403.mail.ne1.yahoo.com
Lists: pgsql-hackers

The BDR presentation at
http://wiki.postgresql.org/images/7/75/BDR_Presentation_PGCon2012.pdf
says:

    "Physical replication forces us to use just one
     node: multi-master required for write scalability"

    "Physical replication provides best read scalability"

I am inclined to agree with the second statement, but
I think my proposal invalidates the first, at least when
the data is rigorously partitioned so that each piece of
data is owned by exactly one server.

In my own workflow, I load lots of data from different
sources.  The partition the data loads into depends on
which source it came from, and data from different sources
is never mixed or cross-referenced in any operation that
writes data.
It is only "mixed" in the sense that applications query
data from multiple sources.
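
To make that ownership rule concrete, here is a minimal sketch in
Python.  The source names, server names, and the owner_for_write
helper are all hypothetical, invented for illustration; nothing
here is part of PostgreSQL or BDR.  It only shows the invariant I
am relying on: every write touches data owned by exactly one server.

```python
# Hypothetical mapping from data source to the one server that owns
# (and is allowed to write) that source's partition.
OWNER_BY_SOURCE = {
    "source_a": "server1",
    "source_b": "server2",
    "source_c": "server3",
}


def owner_for_write(sources):
    """Return the single server allowed to run a write touching `sources`.

    Raises ValueError if the write would span partitions owned by
    different servers (or touches no source at all), which the
    partitioning scheme forbids.
    """
    owners = {OWNER_BY_SOURCE[s] for s in sources}
    if len(owners) != 1:
        raise ValueError(f"write must have exactly one owner, got {sorted(owners)}")
    return owners.pop()


if __name__ == "__main__":
    # A load from one source is routed to that source's owner.
    print(owner_for_write(["source_a"]))
    # A write mixing two sources is rejected.
    try:
        owner_for_write(["source_a", "source_b"])
    except ValueError as exc:
        print("rejected:", exc)
```

Reads are not restricted this way: under the proposal every server
would hold a replica of every partition, so a query mixing sources
could run on any of them.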

So for me, multi-master with physical replication seems
possible, and would presumably provide the best
read scalability.  I doubt that I am the only database
user who has this kind of workflow.

The alternatives are ugly.  I can load data from separate
sources into separate database servers *without* replication
between them, but then the application layer has to
emulate queries across the data.  (Yuck.)  Or I can use
logical replication such as BDR, but then the servers
are spending more effort than with physical replication,
so I get less bang for the buck when I purchase more
servers to add to the cluster.  Or I can use FDW to access
data from other servers, but that means the same data
may be pulled across the wire arbitrarily many times, with
corresponding impact on the bandwidth.

Am I missing something here?  Does BDR really provide
an equivalent solution?

Second, it seems that BDR leaves to the client the responsibility
for making schemas the same everywhere.  Perhaps this is just
a limitation of the implementation so far, which will be resolved
in the future?

On Tuesday, December 31, 2013 12:33 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

Mark Dilger wrote:

> This is not entirely "pie in the sky", but feel free to tell me why this is crazy.

Have you seen http://wiki.postgresql.org/wiki/BDR ?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
