Re: Fast promotion failure

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast promotion failure
Date: 2013-05-20 19:40:14
Message-ID: 519A7C1E.2050604@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20.05.2013 22:18, Simon Riggs wrote:
> On 20 May 2013 18:47, Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> wrote:
>> Not sure what the best fix would be. Perhaps change the code in
>> CreateRestartPoint() to do something like this instead:
>>
>> GetXLogReplayRecPtr(&replayTLI);
>> if (RecoveryInProgress())
>> ThisTimeLineID = replayTLI;
>
> Installing a few extra WAL files doesn't seem to be a good reason to
> mess with such an important variable as ThisTimeLineID, especially
> since its such a rare event and hardly worth optimising for.
>
> I would prefer it if we didn't set ThisTimeLineID at all in that way,
> or anywhere else. If we do, it needs to be done safely, otherwise any
> caller could make the same mistake.

> Suggested patch attached.

> @@ -7466,14 +7468,6 @@ CreateRestartPoint(int flags)
> * not be used, and will go wasted until recycled on the next
> * restartpoint. We'll live with that.
> */
> - (void) GetXLogReplayRecPtr(&ThisTimeLineID);
> -
> - RemoveOldXlogFiles(_logSegNo, endptr);
> -
> - /*
> - * Make more log segments if needed. (Do this after recycling old log
> - * segments, since that may supply some of the needed files.)
> - */
> PreallocXlogFiles(endptr);
> }

That's essentially reverting commit 343ee00b7, resurrecting the bug that
that fixed.

> @@ -9098,10 +9092,10 @@ GetXLogReplayRecPtr(TimeLineID *replayTLI)
> SpinLockAcquire(&xlogctl->info_lck);
> recptr = xlogctl->lastReplayedEndRecPtr;
> tli = xlogctl->lastReplayedTLI;
> + if (replayTLI && xlogctl->SharedRecoveryInProgress)
> + *replayTLI = tli;
> SpinLockRelease(&xlogctl->info_lck);
>
> - if (replayTLI)
> - *replayTLI = tli;
> return recptr;
> }

That breaks the only remaining caller that passes a replayTLI pointer,
GetStandbyFlushRecPtr(); *replayTLI points to a local variable in that
function, which is left uninitialized if !xlogctl->SharedRecoveryInProgress.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-05-20 19:43:05 Re: Heap truncation without AccessExclusiveLock (9.4)
Previous Message Alvaro Herrera 2013-05-20 19:38:50 Re: FK locking concurrency improvement