Re: Cascade replication

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cascade replication
Date: 2011-07-05 11:08:11
Message-ID: CA+U5nMJRCfU-crF68MR+O_37uTXPz8NHFGtGV8ofS2=btdi6Og@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 5, 2011 at 4:34 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Mon, Jul 4, 2011 at 6:24 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Tue, Jun 14, 2011 at 6:08 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>
>>>> The standby must not accept replication connection from that standby itself.
>>>> Otherwise, since any new WAL data would not appear in that standby,
>>>> replication cannot advance any more. As a safeguard against this, I introduced
>>>> new ID to identify each instance. The walsender sends that ID as the fourth
>>>> field of the reply of IDENTIFY_SYSTEM, and then walreceiver checks whether
>>>> the IDs are the same between two servers. If they are the same, which means
>>>> that the standby is just connecting to that standby itself, so walreceiver
>>>> emits ERROR.
>>
>> Thanks for waiting for review.
>
> Thanks for the review!

> I agree to focus on the main problem first. I removed that. Attached
> is the updated version.

Now for the rest of the review...

I'd rather not include another chunk of code related to
wal_keep_segments. The existing code in CreateCheckPoint() should be
refactored so that we call the same code from both CreateCheckPoint()
and CreateRestartPoint().
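
A minimal sketch of that refactoring, not the actual PostgreSQL code: the
wal_keep_segments clamp lives in one helper that both CreateCheckPoint() and
CreateRestartPoint() could call. The helper name and its parameters are
hypothetical, and a single 64-bit segment number is assumed for brevity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical shared helper. Returns the lowest segment number that must
 * be retained, given the segment currently being written, the oldest
 * segment still needed for recovery, and the wal_keep_segments setting.
 * Segments below the returned number may be removed or recycled.
 */
uint64_t
keep_log_seg(uint64_t current_segno, uint64_t oldest_needed, int keep_segments)
{
    uint64_t keep_from;

    assert(keep_segments >= 0);

    /* Setting disabled: only the recovery requirement applies. */
    if (keep_segments == 0)
        return oldest_needed;

    /* Avoid unsigned underflow near the start of the WAL stream. */
    if (current_segno <= (uint64_t) keep_segments)
        keep_from = 0;
    else
        keep_from = current_segno - (uint64_t) keep_segments;

    /* Retain whichever requirement reaches further back. */
    return keep_from < oldest_needed ? keep_from : oldest_needed;
}
```

With one helper, a checkpoint and a restartpoint that see the same inputs
compute the same removal cutoff, so the two code paths cannot drift apart.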

IMHO it's time to get rid of RECOVERYXLOG as an initial target for
de-archived files. That made sense once, but now that we have streaming
it makes more sense to de-archive straight onto the correct file
name and let the file be cleaned up later. So de-archiving it and then
copying it to the new location doesn't seem the right thing to do
(especially copying rather than renaming). RECOVERYXLOG allowed us
to de-archive the file without removing a pre-existing file, so we
must still handle that case; the current patch would fail if a
pre-existing WAL file were there.
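
One possible way to handle the pre-existing file, sketched with plain POSIX
calls rather than PostgreSQL's own file APIs: clear the final file name before
the restore command writes to it. The function name is hypothetical, and
removing the old file (instead of, say, renaming it aside so it survives a
failed restore) is an assumption; the message leaves that choice open.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure nothing occupies the final WAL file name
 * so that a restore command can de-archive directly onto it.
 * Returns 0 when the path is clear, -1 on any other error (errno is set).
 */
int
clear_restore_target(const char *path)
{
    assert(path != NULL);

    if (unlink(path) == 0)
        return 0;               /* a pre-existing file was removed */

    if (errno == ENOENT)
        return 0;               /* nothing was there, which is fine */

    return -1;                  /* e.g. permission problem: report it */
}
```

A caller would invoke this just before running the restore command with the
final path as its target, and treat a -1 result as a reason not to proceed.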

Those changes will make this code cleaner for the long term.

I don't think we should simply shut down a WALSender when we start up.
That is indistinguishable from a failure, which is going to be very
worrying if we do a switchover. Is there another way to do this? If
not, we should at least emit a log message explaining that the shutdown
was requested and is normal.
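
A small sketch of the log-message idea, under stated assumptions: the enum,
the function, and the message wording are all hypothetical and are not taken
from the patch or from PostgreSQL. The point is only that a requested shutdown
and a failure produce visibly different log lines.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reasons for a walsender exiting. */
typedef enum
{
    WS_EXIT_FAILURE,            /* unexpected termination */
    WS_EXIT_REQUESTED           /* shutdown deliberately requested */
} WalSndExitReason;

/*
 * Format a log line that tells an operator whether the exit was expected.
 * Returns the snprintf() result: the length the full message needs.
 */
int
format_walsender_exit(char *buf, size_t len, int pid, WalSndExitReason reason)
{
    assert(buf != NULL && len > 0);

    if (reason == WS_EXIT_REQUESTED)
        return snprintf(buf, len,
                        "terminating walsender process %d as requested; "
                        "this is normal", pid);

    return snprintf(buf, len,
                    "walsender process %d exited unexpectedly", pid);
}
```

During a switchover, an operator reading the log could then tell at a glance
that the disconnect was intended rather than a sign of trouble.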

It would be possible to have synchronous cascaded replication but that
is probably another patch :-)

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
