Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby
Date: 2009-07-07 18:21:51
Message-ID: 10519.1246990911@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Tue, Jul 7, 2009 at 4:49 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> This design seems totally wrong to me.
>> ...

> But this conflicts with earlier discussions where we were concerned
> about the length of the path WAL has to travel between the master and
> the slaves. We want slaves to be able to be turned on simply using a
> simple robust configuration and to be able to respond quickly to
> transactions that are committed in the master for synchronous
> operation.

Well, the problem I've really got with this is that if you want sync
replication, couching it in terms of WAL files in the first place seems
like getting off on fundamentally the wrong foot. That still leaves you
with all the BS about having to force WAL file switches (and eat LSN
space) for all sorts of undesirable reasons. I think we want the
API to operate more like a WAL stream. I would envision the slaves
connecting to the master's replication port and asking "feed me WAL
beginning at LSN position thus-and-so", with no notion of WAL file
boundaries exposed anyplace. The point about not wanting to archive
lots of WAL on the master would imply that the master reserves the right
to fail if the requested starting position is too old, whereupon the
slave needs some way to resync --- but that probably involves something
close to taking a fresh base backup to copy to the slave. You either
have the master not recycle its WAL while the backup is going on (so the
slave can start reading afterwards), or expect the slave to absorb and
buffer the WAL stream while the backup is going on. In neither case is
there any reason to have an API that involves fetching arbitrary chunks
of past WAL, and certainly not one that is phrased as fetching specific
WAL segment files.
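
To make the shape of such an interface concrete, here is a minimal sketch in Python. It is a toy in-memory model, not anything from the PostgreSQL source tree: the names ToyMaster, WalTooOldError, stream_from and recycle are all hypothetical, and the WAL is treated as one flat byte stream addressed by integer LSN. The slave names only a start position, the master refuses if that position has already been recycled, and no segment-file boundary appears anywhere in the interface.

```python
from dataclasses import dataclass, field


class WalTooOldError(Exception):
    """The requested start LSN is older than anything the master retains."""


@dataclass
class ToyMaster:
    # Hypothetical model: WAL is one flat byte stream addressed by LSN.
    oldest_lsn: int = 0  # LSN of the first byte still retained
    wal: bytearray = field(default_factory=bytearray)

    @property
    def end_lsn(self) -> int:
        """LSN just past the last byte written so far."""
        return self.oldest_lsn + len(self.wal)

    def append(self, record: bytes) -> int:
        """Write a record at the end of WAL and return the new end LSN."""
        self.wal += record
        return self.end_lsn

    def recycle(self, upto_lsn: int) -> None:
        """Discard retained WAL before upto_lsn (clamped to the valid range)."""
        upto_lsn = min(max(upto_lsn, self.oldest_lsn), self.end_lsn)
        del self.wal[: upto_lsn - self.oldest_lsn]
        self.oldest_lsn = upto_lsn

    def stream_from(self, start_lsn: int, chunk: int = 8):
        """Answer "feed me WAL beginning at start_lsn".

        Validation happens eagerly, so a too-old request fails at once
        rather than on the first read from the returned iterator.
        """
        if start_lsn < self.oldest_lsn:
            raise WalTooOldError(
                f"requested {start_lsn}, oldest retained is {self.oldest_lsn}"
            )
        if start_lsn > self.end_lsn:
            raise ValueError("start position is beyond the end of WAL")

        def chunks():
            pos = start_lsn
            while pos < self.end_lsn:
                off = pos - self.oldest_lsn
                data = bytes(self.wal[off : off + chunk])
                yield pos, data  # (LSN of first byte, payload)
                pos += len(data)

        return chunks()
```

A slave that gets WalTooOldError would have to resync, for example from a fresh base backup as described above. The sketch ignores networking and does not handle WAL being recycled while a stream is in progress; both would need real answers in an actual design.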

There are still some interesting questions in this about exactly how you
switch over from "catchup mode" to following the live WAL broadcast.
With the above design it would be the master's responsibility to manage
that, since presumably the requested start position will almost always
be somewhat behind the live end of WAL. It might be nicer to push that
complexity to the slave side, but then you do need two data paths
somehow (i.e., retrieving the slightly-stale WAL is separated from
tracking live events). Which is what you're saying we should avoid,
and I do see the point there.
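
One way to picture the master-managed switchover is a per-slave sender that starts in a catchup mode and flips to a live mode once it has sent everything up to the current end of WAL. The Python sketch below is purely illustrative: ToySender, next_message and the "catchup"/"live" mode strings are hypothetical names, and the WAL is again modelled as a flat byte string starting at LSN 0. The slave sees a single data path throughout; only the master knows which mode it is in.

```python
class ToySender:
    """Hypothetical per-slave sender state kept on the master side."""

    def __init__(self, start_lsn: int, chunk: int = 4):
        self.sent_lsn = start_lsn  # next LSN to send to this slave
        self.chunk = chunk         # max bytes per message while catching up
        self.mode = "catchup"

    def next_message(self, wal: bytes):
        """Return the next (lsn, data) for the slave, or None if caught up.

        `wal` is the entire stream from LSN 0, for illustration only.
        """
        end_lsn = len(wal)
        if self.sent_lsn >= end_lsn:
            # Nothing left to send: the slave is at the live end of WAL.
            self.mode = "live"
            return None
        if self.mode == "catchup":
            limit = self.chunk               # bounded reads of older WAL
        else:
            limit = end_lsn - self.sent_lsn  # forward everything new at once
        data = wal[self.sent_lsn : self.sent_lsn + limit]
        msg = (self.sent_lsn, data)
        self.sent_lsn += len(data)
        if self.sent_lsn >= end_lsn:
            self.mode = "live"
        return msg
```

For example, with wal = b"0123456789" and a sender started at LSN 3, the first call returns (3, b"3456") in catchup mode, the second returns (7, b"789") and flips the sender to live, and later calls return new bytes as soon as they are appended. The sketch never drops back to catchup when a slave lags, which a real design would presumably need to consider.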

regards, tom lane
