Re: PITR potentially broken in 9.2

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: PITR potentially broken in 9.2
Date: 2012-12-05 21:56:52
Message-ID: 20121205215651.GU27424@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 2012-12-05 16:15:38 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On 5 December 2012 18:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> On further thought, it seems like recovery_pause_at_target is rather
> >> misdesigned anyway, and taking recovery target parameters from
> >> recovery.conf is an obsolete API that was designed in a world before hot
> >> standby. What I suggest people really want, if they're trying to
> >> interactively determine how far to roll forward, is this:
> >> ...
>
> > Can't remember why we didn't go for the full API last time. I'll have
> > another go, in HEAD.
>
> That's fine, but the immediate question is what are we doing to fix
> the back branches. I think everyone is clear that we should be testing
> LocalHotStandbyActive rather than precursor conditions to see if a pause
> is allowed, but are we going to do anything more than that?

I'd like to have inclusive/non-inclusive stops some resemblance of
sanity.

Raw patch including your earlier LocalHotStandbyActive one attached.

> The only other thing I really wanted to do is not have the in-loop pause
> occur after we've taken actions that are effectively part of the WAL
> apply step. I think ideally it should happen just before or after the
> recoveryStopsHere stanza. Last night I worried about adding an extra
> spinlock acquire to make that work, but today I wonder if we couldn't
> get away with just a naked
>
> if (xlogctl->recoveryPause)
> recoveryPausesHere();

> The argument for this is that although we might fetch a slightly stale
> value of the shared variable, it can't be very stale --- certainly no
> older than the spinlock acquisition near the bottom of the previous
> iteration of the loop. And this is a highly asynchronous feature
> anyhow, so fuzziness of plus or minus one WAL record in when the pause
> gets honored is not going to be detectable. Hence an extra spinlock
> acquisition is not worth the cost.

Seems safe enough to me. But I am not sure its necessary, if we move
the recoveryPausesHere() down after replaying the record, fetching
inside the spinlock for recoveryLastRecPtr seems to be natural enough.

Greetings,

Andres Freund

Attachment Content-Type Size
recovery-pause.patch text/x-patch 2.4 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Robert Haas 2012-12-05 22:10:43 Re: PITR potentially broken in 9.2
Previous Message Tom Lane 2012-12-05 21:38:17 Re: PITR potentially broken in 9.2

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2012-12-05 21:59:57 Re: Dumping an Extension's Script
Previous Message Simon Riggs 2012-12-05 21:46:03 Re: Review: Extra Daemons / bgworker