Re: Synchronous Log Shipping Replication

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Hannu Krosing <hannu(at)krosing(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronous Log Shipping Replication
Date: 2008-09-10 08:06:14
Message-ID: 48C77FF6.3010804@bluegap.ch
Lists: pgsql-hackers

Hi,

Simon Riggs wrote:
> 1. Standby contacts primary and says it would like to catch up, but is
> currently at point X (which is a point at, or after the first consistent
> stopping point in WAL after standby has performed its own crash
> recovery, if any was required).
> 2. primary initiates data transfer of old data to standby, starting at
> point X
> 3. standby tells primary where it has got to periodically
> 4. at some point primary decides primary and standby are close enough
> that it can now begin streaming "current WAL" (which is always the WAL
> up to wal_buffers behind the current WAL insertion point).

Hm, wouldn't it be simpler to start streaming right away and "cache"
that on the standby until it can be applied? I.e. a protocol like:

1. - same as above -
2. primary starts streaming live ('hot') data from its current
position Y in the WAL stream, which is certainly after (or possibly
equal to) X.
3. standby receives the hot stream from point Y on. It now knows it is
missing the 'cold' portion of the WAL from X to Y and requests that.
4. primary serves the remaining 'cold' WAL chunks between X and Y from
its xlog / archive.
5. standby applies 'cold' WAL, until done. Then proceeds with the cached
WAL segments from 'hot' streaming.
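
To make the ordering of the steps above concrete, here is a minimal
sketch of the idea as a toy in-memory simulation. It is not PostgreSQL
code: the Primary and Standby classes, the catch_up() helper and the
use of list indexes as WAL positions are all hypothetical, chosen only
to show that caching the hot stream and back-filling the cold range
ends with the standby holding the full WAL in order.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Hypothetical primary; the WAL is modelled as a list of records."""
    wal: list = field(default_factory=list)

    def current_position(self):
        # Position Y: index of the next record to be written.
        return len(self.wal)

    def append(self, record):
        # Write a new record and return it so it can be streamed "hot".
        self.wal.append(record)
        return record

    def read_cold(self, start, end):
        # Step 4: serve old WAL between X and Y (xlog / archive stand-in).
        return self.wal[start:end]


@dataclass
class Standby:
    """Hypothetical standby with applied WAL and a cache for hot WAL."""
    applied: list = field(default_factory=list)
    hot_cache: list = field(default_factory=list)

    def position(self):
        # Point X: how far the standby has applied WAL so far.
        return len(self.applied)

    def receive_hot(self, record):
        # Step 3: hot WAL is cached, not applied, while cold WAL is missing.
        self.hot_cache.append(record)

    def apply_cold(self, records):
        # Step 5, first half: apply the back-filled X..Y range.
        self.applied.extend(records)

    def apply_cached(self):
        # Step 5, second half: proceed with the cached hot WAL.
        self.applied.extend(self.hot_cache)
        self.hot_cache.clear()


def catch_up(primary, standby, new_records):
    x = standby.position()            # step 1: standby reports point X
    y = primary.current_position()    # step 2: streaming starts at Y
    for record in new_records:        # writes arriving during catch-up
        standby.receive_hot(primary.append(record))
    standby.apply_cold(primary.read_cold(x, y))  # steps 3 and 4
    standby.apply_cached()                       # step 5
    return x, y
```

In this toy model the cold range is fetched in one call; a real
implementation would presumably transfer it in chunks while hot WAL
keeps arriving, which is exactly why the standby needs the cache.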

> Bear in mind that unless wal_buffers > 16MB the final catchup will
> *always* be less than one WAL file, so external file based mechanisms
> alone could never be enough.

Agreed.
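
As a rough illustration of that arithmetic, the sketch below computes
the stretch of WAL that neither whole-file shipping nor streaming from
wal_buffers would cover. It assumes the default 16 MB WAL segment size
and treats WAL positions as plain byte offsets; uncovered_gap() and the
example numbers are hypothetical, and this is only one reading of the
argument above.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size, 16 MB


def uncovered_gap(insert_pos, wal_buffers, segment_size=WAL_SEGMENT_SIZE):
    """Bytes of WAL covered by neither complete files nor wal_buffers."""
    # File-based shipping can only deliver complete segments.
    file_shipped_upto = (insert_pos // segment_size) * segment_size
    # Streaming of "current WAL" starts at most wal_buffers behind
    # the insertion point.
    stream_from = max(0, insert_pos - wal_buffers)
    return max(0, stream_from - file_shipped_upto)


# Example: insertion point is 10 MB into the sixth segment and
# wal_buffers is a hypothetical 64 kB.
insert_pos = 5 * WAL_SEGMENT_SIZE + 10 * 1024 * 1024
gap = uncovered_gap(insert_pos, wal_buffers=64 * 1024)
```

The gap is always smaller than one segment, so it has to be served
from the partially filled current WAL file rather than from shipped
files.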

> This also probably means that receipt of WAL data on the standby cannot
> be achieved by placing it in wal_buffers. So we probably need to write
> it directly to the WAL files, then rely on the filesystem cache on the
> standby to buffer the data for use by ReadRecord.

Makes sense, especially for the cached WAL outlined above. Is this
a problem in any way?

Regards

Markus Wanner
