Re: Keepalive for max_standby_delay

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Keepalive for max_standby_delay
Date: 2010-06-01 10:36:59
Message-ID: 4C04E2CB.90304@enterprisedb.com
Lists: pgsql-hackers

On 27/05/10 20:26, Simon Riggs wrote:
> On Wed, 2010-05-26 at 16:22 -0700, Josh Berkus wrote:
>>> Just this second posted about that, as it turns out.
>>>
>>> I have a v3 *almost* ready of the keepalive patch. It still makes sense
>>> to me after a few days' reflection, so is worth discussion and review. In
>>> or out, I want this settled within a week. Definitely need some R&R
>>> here.
>>
>> Does the keepalive fix all the issues with max_standby_delay? Tom?
>
> OK, here's v4.
>
> Summary
>
> * WALSender adds a timestamp onto the header of every WAL chunk sent.
>
> * Each WAL record now has a conceptual "send timestamp" that remains
> constant while that record is replayed. This is used as the basis from
> which max_standby_delay is calculated when required during replay.
>
> * Send timestamp is calculated as the later of the timestamp of chunk in
> which WAL record was sent and the latest XLog time.
>
> * WALSender sends an empty message as a keepalive when nothing else to
> send. (No longer a special message type for the keepalive).
>
> I think it's close, but if there's a gaping hole here somewhere then I'll
> punt for this release.

This certainly alleviates some of the problems. However, you still need
to ensure that master and standby have synchronized clocks, and when not
using streaming replication you still get zero grace time after a long
period of inactivity.
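
To make the clock dependency concrete, here is a rough sketch of the
calculation described in the summary above. The function and type names
are made up for illustration and are not taken from the patch; the point
is only that the standby compares its own clock against a timestamp
taken on the master.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: microseconds since some epoch. */
typedef int64_t usec_t;

/*
 * "Send timestamp" of a WAL record: the later of the timestamp of the
 * chunk it was sent in and the latest XLog time. Both are master-side
 * values.
 */
usec_t
send_timestamp(usec_t chunk_sent_at, usec_t latest_xlog_time)
{
    return chunk_sent_at > latest_xlog_time ? chunk_sent_at : latest_xlog_time;
}

/*
 * The standby checks its own clock against the master-side send
 * timestamp, so any skew between the two clocks is added to (or
 * subtracted from) the measured delay.
 */
bool
delay_exceeded(usec_t standby_now, usec_t send_ts, usec_t max_delay)
{
    return standby_now - send_ts > max_delay;
}
```

With a standby clock running 10 seconds ahead and a 5 second limit, a
record that arrived one second ago would already count as over the
limit.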

Sending a keep-alive message every 100ms seems overly aggressive to me.

If we really want to try to salvage max_standby_delay with a meaning
similar to what it has now, I think we should go with the idea some
people bashed around earlier and define the grace period as the time
between a WAL record becoming available to the standby for replay and
the moment it is actually replayed. An approximation of that is to do
"lastIdle = gettimeofday()" in XLogPageRead() whenever it needs to wait
for new WAL to arrive, whether that's via streaming replication or by a
success return code from restore_command, and to check the difference
between that and the current timestamp in WaitExceedsMaxStandbyDelay().

That's very simple, doesn't require synchronized clocks, and works the
same with file- and stream-based setups.
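
A minimal sketch of that idea, assuming made-up helper names and a
millisecond setting (this is not the actual XLogPageRead() or
WaitExceedsMaxStandbyDelay() code, just the shape of it):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical type: microseconds on the standby's own clock. */
typedef int64_t usec_t;

/* When replay last had to wait for new WAL to arrive. */
static usec_t last_idle_usec = 0;

/* Current time on the standby, in microseconds. */
usec_t
now_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (usec_t) tv.tv_sec * 1000000 + tv.tv_usec;
}

/*
 * To be called from the point where the page-read code finds no WAL
 * available and has to wait, whether for the stream or for
 * restore_command to return successfully.
 */
void
note_wal_wait(usec_t now)
{
    last_idle_usec = now;
}

/*
 * True once replay has been running for longer than the allowed delay
 * since it last caught up. Only the standby's clock is involved.
 */
bool
wait_exceeds_max_standby_delay(usec_t now, int max_standby_delay_ms)
{
    return now - last_idle_usec > (usec_t) max_standby_delay_ms * 1000;
}
```

In real use both functions would be passed now_usec(); the time is a
parameter here only so the logic can be exercised with fixed values.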

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
