Re: Cascading replication and recovery_target_timeline='latest'

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: hlinnaka(at)iki(dot)fi
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cascading replication and recovery_target_timeline='latest'
Date: 2012-09-05 00:14:11
Message-ID: 50469953.1070603@iki.fi
Lists: pgsql-hackers

On 03.09.2012 17:40, Heikki Linnakangas wrote:
> On 03.09.2012 16:26, Heikki Linnakangas wrote:
>> On 03.09.2012 16:25, Fujii Masao wrote:
>>> On Tue, Sep 4, 2012 at 7:07 AM, Heikki Linnakangas<hlinnaka(at)iki(dot)fi>
>>> wrote:
>>>> Hmm, I was thinking that when walsender gets the position it can send
>>>> the
>>>> WAL up to, in GetStandbyFlushRecPtr(), it could atomically check the
>>>> current
>>>> recovery timeline. If it has changed, refuse to send the new WAL and
>>>> terminate. That would be a fairly small change, it would just close the
>>>> window between requesting walsenders to terminate and them actually
>>>> terminating.
>>>
>>> Yeah, sounds good. Could you implement the patch? If you don't have
>>> time,
>>> I will....
>>
>> I'll give it a shot..
>
> So, this is what I came up with, please review.

While testing, I bumped into another related bug: When a WAL segment is
restored from the archive, we let a walsender send that whole WAL
segment to a cascading standby. However, there's no guarantee that the
restored WAL segment is complete. In particular, if the timeline changes
within that segment, e.g. 000000010000000000000004, that segment will be
only partially full, and the WAL continues in segment
000000020000000000000004, on the next timeline. This can also happen if
you copy a partial WAL segment to the archive, for example from a
crashed master server, or if you have set up record-based WAL shipping
without streaming replication, per
http://www.postgresql.org/docs/devel/static/log-shipping-alternative.html#WARM-STANDBY-RECORD.
That manual page says you can only deal with whole WAL files that way,
but I think with standby_mode='on', that's actually no longer true.
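
To make the two file names above concrete, here is a small Python sketch
that splits a WAL segment file name into its three 8-hex-digit fields
(timeline ID, log number, segment number). The helper names
parse_wal_segment_name and same_position_on_other_timeline are made up
for this illustration and are not PostgreSQL functions; the sketch only
shows that the two files cover the same segment position on different
timelines, which is why the first one can be partially filled.

```python
import re
from typing import NamedTuple


class WalSegment(NamedTuple):
    timeline: int  # first 8 hex digits of the file name
    log: int       # middle 8 hex digits
    seg: int       # last 8 hex digits


def parse_wal_segment_name(name: str) -> WalSegment:
    """Split a 24-hex-digit WAL segment file name into its fields.

    Hypothetical helper for illustration only.
    """
    if re.fullmatch(r"[0-9A-Fa-f]{24}", name) is None:
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return WalSegment(
        timeline=int(name[0:8], 16),
        log=int(name[8:16], 16),
        seg=int(name[16:24], 16),
    )


def same_position_on_other_timeline(a: str, b: str) -> bool:
    """True if both names cover the same segment but on different timelines."""
    sa, sb = parse_wal_segment_name(a), parse_wal_segment_name(b)
    return (sa.log, sa.seg) == (sb.log, sb.seg) and sa.timeline != sb.timeline


if __name__ == "__main__":
    old = "000000010000000000000004"
    new = "000000020000000000000004"
    print(parse_wal_segment_name(old))
    print(parse_wal_segment_name(new))
    print(same_position_on_other_timeline(old, new))
```

If the check is true for a pair of files in the archive, the one on the
older timeline may well end partway through the segment.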

So all in all, it seems like a shaky assumption that once you've
restored a WAL file from the archive, you're free to stream it to a
cascading slave. I think it would be more robust to limit the walsender
to streaming the file only up to the point that has been replayed - and
thus verified - in the first standby. If everyone is OK with that change
in behavior, the fix is simple.
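
A rough Python sketch of the proposed rule, not the actual patch: WAL
positions are modelled as plain byte offsets, and the names
safe_send_limit, restored_upto and replayed_upto are assumptions made
for this illustration. The point is only that the walsender's upper
bound becomes the replayed position rather than the end of the restored
file.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default segment size, 16 MB


def safe_send_limit(restored_upto: int, replayed_upto: int) -> int:
    """Return the highest WAL position that is safe to stream.

    restored_upto: end of the last segment restored from the archive
                   (hypothetical name, byte offset).
    replayed_upto: position up to which the standby has replayed, and
                   thus verified, the WAL (hypothetical name).
    """
    # Never send beyond what has been replayed, even if more bytes
    # are physically present in the restored file.
    return min(restored_upto, replayed_upto)


if __name__ == "__main__":
    # Segment 4 was restored in full, but replay has only reached
    # 0x2000 bytes into it, e.g. because the timeline changes there.
    restored = 5 * WAL_SEGMENT_SIZE
    replayed = 4 * WAL_SEGMENT_SIZE + 0x2000
    print(hex(safe_send_limit(restored, replayed)))
```

With this rule a partially filled segment is harmless, because the
unverified tail of the file is never offered to the cascading standby.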

- Heikki
