Cascading replication and recovery_target_timeline='latest'

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Cascading replication and recovery_target_timeline='latest'
Date: 2012-08-31 08:03:34
Message-ID: 50406FD6.8050903@iki.fi
Lists: pgsql-hackers

When a cascading standby launches a new walsender, it fetches the
current recovery timeline:

    /*
     * Use the recovery target timeline ID during recovery
     */
    if (am_cascading_walsender)
        ThisTimeLineID = GetRecoveryTargetTLI();

GetRecoveryTargetTLI() does this, with the following comment:

    /* RecoveryTargetTLI doesn't change so we need no lock to copy it */
    return XLogCtl->RecoveryTargetTLI;

That comment is not true. RecoveryTargetTLI can change during recovery,
if you set recovery_target_timeline='latest'. In 'latest' mode, when the
(apparent) end of WAL is reached, the archive is scanned for any new
timeline history files that may have appeared. If a new timeline is
found, RecoveryTargetTLI is updated, and recovery is continued on the
new timeline.
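
To illustrate the locking point, here is a minimal sketch of a reader and a writer that both take a lock around the shared timeline ID. It is not PostgreSQL code: the struct, the function names, and the use of a pthread mutex are all assumptions made for the example, standing in for whatever lock protects the real shared-memory field.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical stand-in for the shared control struct; not the real XLogCtl. */
typedef struct SharedRecoveryState
{
    pthread_mutex_t lock;                 /* protects recovery_target_tli */
    TimeLineID      recovery_target_tli;  /* may change in 'latest' mode */
} SharedRecoveryState;

static SharedRecoveryState shared = { PTHREAD_MUTEX_INITIALIZER, 1 };

/* Reader side: copy the value while holding the lock. */
TimeLineID
get_recovery_target_tli(void)
{
    TimeLineID result;

    pthread_mutex_lock(&shared.lock);
    result = shared.recovery_target_tli;
    pthread_mutex_unlock(&shared.lock);
    return result;
}

/* Writer side: called when a new timeline history file has been found. */
void
set_recovery_target_tli(TimeLineID new_tli)
{
    assert(new_tli != 0);   /* 0 is treated as invalid in this sketch */

    pthread_mutex_lock(&shared.lock);
    shared.recovery_target_tli = new_tli;
    pthread_mutex_unlock(&shared.lock);
}
```

Locking only makes the individual read well-defined. A caller that caches the returned value can still end up holding a stale timeline ID after a later switch.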

Aside from the missing locking, I wonder what that does to a cascaded
standby. If there is an active walsender running while RecoveryTargetTLI
is changed, I think what will happen is that the walsender will continue
to stream WAL from the old timeline, but because the startup process is
now actually replaying from a different timeline, the walsender will
send bogus WAL to the standby.
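
One way to picture the problem is a sender that remembers the timeline it started on and compares it with the timeline currently being replayed before sending each chunk. The sketch below is a single-threaded illustration under that assumption; SenderState, read_replay_tli() and send_next_chunk() are hypothetical names, not existing walsender functions, and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical: the timeline the startup process is replaying right now. */
static TimeLineID current_replay_tli = 1;

static TimeLineID
read_replay_tli(void)
{
    return current_replay_tli;
}

/* Hypothetical per-sender state. */
typedef struct SenderState
{
    TimeLineID tli;                /* timeline fetched when the sender started */
    size_t     chunks_sent;        /* chunks sent so far */
    bool       stopped_on_switch;  /* set once a timeline change is noticed */
} SenderState;

/*
 * Send one chunk, unless the replayed timeline no longer matches the one
 * this sender started on. Returns false when the sender should stop.
 */
bool
send_next_chunk(SenderState *s)
{
    assert(s != NULL);

    if (read_replay_tli() != s->tli)
    {
        s->stopped_on_switch = true;
        return false;
    }

    s->chunks_sent++;   /* stands in for actually streaming WAL */
    return true;
}
```

A check like this only narrows the window. The timeline can still change between the comparison and the send, so it does not remove the race by itself.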

When a standby ends recovery, creates a new timeline, and switches to
normal operation, postmaster terminates all walsenders because of the
timeline change. But don't we have a race condition there, with similar
effect? It might take a while for a walsender to die, and in that
window, it might send bogus WAL to the cascaded standby.

- Heikki
