Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Christopher Browne <cbbrowne(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Farina <daniel(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
Date: 2012-06-20 13:43:55
Message-ID: 201206201543.56125.andres@2ndquadrant.com
Lists: pgsql-hackers

On Wednesday, June 20, 2012 03:02:28 PM Robert Haas wrote:
> On Wed, Jun 20, 2012 at 5:15 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > One bit is fine if you have only very simple replication topologies. Once
> > you think about globally distributed databases it's a bit different. You
> > describe some of that below, but just to reiterate:
> > Imagine having 6 nodes, 3 on each of two continents (ABC in North America,
> > DEF in Europe). You may only want to have full intercontinental
> > interconnect between two of those (say A and D). If you only have one
> > bit to represent the origin that's not going to work because you won't be
> > able to discern the changes from BC on A from the changes
> > originating on DEF.
>
> I don't see the problem. A certainly knows via which link the LCRs
> arrived.
>
> So: change happens on A. A sends the change to B, C, and D. B and C
> apply the change. One bit is enough to keep them from regenerating
> new LCRs that get sent back to A. So they're fine. D also receives
> the changes (from A) and applies them, but it also does not need to
> regenerate LCRs. Instead, it can take the LCRs that it has already
> got (from A) and send those to E and F.
>
> Or: change happens on B. B sends the changes to A. Since A knows the
> network topology, it sends the changes to C and D. D sends them to E
> and F. Nobody except B needs to *generate* LCRs. All any other node
> needs to do is suppress *redundant* LCR generation.
>
> > Another topology which is interesting is circular replications (i.e.
> > changes get shipped A->B, B->C, C->A) which is a sensible topology if
> > you only have a low change rate and a relatively high number of nodes
> > because you don't need the full combinatorial amount of connections.
>
> I think this one is OK too. You just generate LCRs on the origin node
> and then pass them around the ring at every step. When the next hop
> would be the origin node then you're done.
>
> I think you may be imagining that A generates LCRs and sends them to
> B. B applies them, and then from the WAL just generated, it produces
> new LCRs which then get sent to C.
Yes, that's what I am proposing.
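
To make the six-node case concrete, here is a minimal Python sketch of the
filtering an origin id allows when a node decodes its own WAL for each
outgoing link. The `WalRecord` type, the `FORWARD_FROM_A` table and
`changes_for_peer` are hypothetical names for illustration only, not part of
the patch; the topology is the one from the example quoted above (ABC and DEF
fully connected, with a single A-D link).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    origin: str   # node on which the change was first made
    payload: str  # stand-in for the actual change


# Assumed per-link configuration on node A: which origins' changes A ships
# over each outgoing link. B and C already hear from each other directly,
# and D hears from E and F directly.
FORWARD_FROM_A = {
    "B": {"A", "D", "E", "F"},
    "C": {"A", "D", "E", "F"},
    "D": {"A", "B", "C"},
}


def changes_for_peer(wal, peer, forward_map):
    """Return the records from `wal`, in WAL order, that go to `peer`."""
    allowed = forward_map[peer]
    return [rec for rec in wal if rec.origin in allowed]


# A's WAL holds its own change plus changes it applied from other nodes.
wal_on_a = [
    WalRecord("A", "a1"),
    WalRecord("B", "b1"),
    WalRecord("D", "d1"),
    WalRecord("E", "e1"),
]

to_d = [rec.payload for rec in changes_for_peer(wal_on_a, "D", FORWARD_FROM_A)]
to_b = [rec.payload for rec in changes_for_peer(wal_on_a, "B", FORWARD_FROM_A)]
```

With a single local/non-local bit, the `B`, `D` and `E` records above would be
indistinguishable on A, so this per-link choice could not be made from WAL
alone.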

> If you do that, then, yes,
> everything that you need to disentangle various network topologies
> must be present in WAL. But what I'm saying is: don't do it like
> that. Generate the LCRs just ONCE, at the origin node, and then pass
> them around the network, applying them at every node. Then, the
> information that is needed in WAL is confined to one bit: the
> knowledge of whether or not a particular transaction is local (and
> thus LCRs should be generated) or non-local (and thus they shouldn't,
> because the origin already generated them and thus we're just handing
> them around to apply everywhere).
Sure, you can do it that way, but I don't think it's a good idea. If you do it
my way you *guarantee* that when replaying changes from node B on node C you
have replayed changes from A at least as far as B has. That's a really nice
property for multi-master (MM).
You *can* get the same with your solution, but it starts to get complicated
rather fast, while my/our proposed solution is trivial to implement.
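
A rough Python sketch of why the ordering guarantee falls out for free when
every hop re-decodes its own WAL. The `Node` class and its methods are
hypothetical and model WAL as a plain list; this is a toy model of the idea
under those assumptions, not the proposed implementation.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.wal = []  # (origin, change) pairs in local commit order

    def local_change(self, change):
        # A change made on this node is logged with this node as origin.
        self.wal.append((self.name, change))

    def replay_from(self, upstream, start=0):
        """Apply upstream's WAL stream in order, keeping the origin id.

        Changes that originated here are skipped, which is also what
        terminates a ring. Returns the position to resume from next time.
        """
        for origin, change in upstream.wal[start:]:
            if origin != self.name:
                self.wal.append((origin, change))
        return len(upstream.wal)


a, b, c = Node("A"), Node("B"), Node("C")

a.local_change("a1")
b.replay_from(a)        # B applies a1, logged in B's WAL with origin A
b.local_change("b1")    # b1 may depend on a1
c.replay_from(b)        # C sees B's WAL order: a1 first, then b1
a.replay_from(c)        # ring closes: A skips its own a1, applies b1

c_order = [change for _, change in c.wal]
```

Because C only ever consumes B's stream in B's commit order, it cannot apply
`b1` before `a1`. If A instead shipped `a1` to C directly and independently of
B's stream, C would need extra bookkeeping to get the same guarantee.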

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
