Re: Proposal for a cascaded master-slave replication system

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for a cascaded master-slave replication system
Date: 2003-11-11 19:54:27
Message-ID: 3FB13E73.8090404@Yahoo.com
Lists: pgsql-general pgsql-hackers

Joe Conway wrote:

> Jan Wieck wrote:
>> http://developer.postgresql.org/~wieck/slony1.html
>
> Very interesting read. Nice work!
>
>> We want to build this system as a community project. The plan was from
>> the beginning to release the product under the BSD license. And we think
>> it is best to start it as such and to ask for suggestions during the
>> design phase already.
>
> I couldn't quite tell from the design doc -- do you intend to support
> conditional replication at a row level?

If you mean configuring the system to replicate rows to different
destinations (slaves) based on arbitrary qualifications, no. I had
thought about it, but it does not really fit into the "datacenter and
failover" picture, so it is not required to meet the goals and would add
unnecessary complexity.

This sort of feature is much more important for a replication system
designed for hundreds or thousands of sporadic, asynchronous
multi-master systems, the typical "salesman on the street" kind of
replication.

>
> I'm also curious, with cascaded replication, how do you handle the case
> where a second level slave has a transaction failure for some reason, i.e.:
>
>          M
>         / \
>        /   \
>      Sa     Sb
>     /  \   /  \
>   Sc   Sd Se   Sf
>
> What happens if data is successfully replicated to Sa, Sb, Sc, and Sd,
> and then an exception/rollback occurs on Se?

First, it does not replicate single transactions. It replicates batches
of them together. Since the transactions are already committed (and
possibly others that depend on them too), there is no way to undo
them - you lose Se.

If this is only a temporary failure, like a power failure, and the
database recovers fine on restart, including the last confirmed SYNC
event, it will catch up. The caveat: SYNC events get confirmed after
they commit locally, but that is before the next checkpoint, so there is
actually a gap in which the slave could lose a committed transaction,
and then it is lost for sure.
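
To make the catch-up rule concrete, here is a minimal sketch in Python.
It is a toy model only, not the Slony-I design: the Node class, its
fields and the catch_up helper are all hypothetical names, and
"durable only at checkpoint" is a simplification of the gap described
above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy replication slave; every name here is hypothetical."""
    name: str
    log: list = field(default_factory=list)  # SYNC batches applied so far
    confirmed: int = 0  # last SYNC number confirmed to the upstream node
    durable: int = 0    # last SYNC number that would survive a crash

    def apply(self, sync_no, batch):
        # A batch is a group of transactions that are already committed
        # upstream; batches must be applied strictly in SYNC order.
        assert sync_no == len(self.log) + 1, "SYNCs must be applied in order"
        self.log.append(batch)
        self.confirmed = sync_no  # confirmed right after the local commit

    def checkpoint(self):
        # Assumption of this toy model: only a checkpoint makes SYNCs durable.
        self.durable = self.confirmed

    def crash_and_restart(self):
        # Everything after the last durable SYNC is gone.
        del self.log[self.durable:]
        # The node can catch up only if no confirmed SYNC was lost.
        return self.durable >= self.confirmed


def catch_up(upstream_batches, node):
    """Replay every SYNC batch the node has not applied yet, in order."""
    for sync_no in range(len(node.log) + 1, len(upstream_batches) + 1):
        node.apply(sync_no, upstream_batches[sync_no - 1])
```

A node that crashes right after a checkpoint returns True from
crash_and_restart() and catch_up() brings it back in line. A node that
crashes between confirming a SYNC and the next checkpoint returns False,
which is the "you lose Se" case.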

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
