Re: BUG #7710: Xid epoch is not updated properly during checkpoint

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: tarvip(at)gmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Daniel Farina <daniel(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Date: 2012-12-01 22:59:21
Message-ID: 20121201225921.GA25134@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi all,

On 2012-11-27 19:52:09 +0000, tarvip(at)gmail(dot)com wrote:
> This happens only if wal_level=hot_standby.
>
>
> Here are the steps to reproduce this issue.

Oh. Fucking. Wow.

I think this tiny comment just helped me find the bug. After previously
having looked for it without success for some time I just recognized the
problem while working on something unrelated.

Here it goes:

void
CreateCheckPoint(int flags)
{
...
/*
* Get the other info we need for the checkpoint record.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
checkPoint.nextXid = ShmemVariableCache->nextXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);

/* Increase XID epoch if we've wrapped around since last checkpoint */
checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
checkPoint.nextXidEpoch++;
// i.e. compute the epoch based on the *current* nextXid.

...
// do all the writing, take some time
CheckPointGuts(checkPoint.redo, flags);
...
* Update checkPoint.nextXid since we have a later value
*/
if (!shutdown && XLogStandbyInfoActive())
LogStandbySnapshot(&checkPoint.nextXid);
...
/* Update shared-memory copy of checkpoint XID/epoch */
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;

SpinLockAcquire(&xlogctl->info_lck);
xlogctl->ckptXidEpoch = checkPoint.nextXidEpoch;
xlogctl->ckptXid = checkPoint.nextXid;
SpinLockRelease(&xlogctl->info_lck);
}
...
}

Notice the end of the comment above/about LogStandbySnapshot. It updates
nextXid! But it does *not* recheck whether we have had a wraparound +
so it does not increase the epoch counter. Although we might have had a
wraparound inbetween.

Which in turn means the the ShmemVariableCache->nextXid <
xlogctl->ckptXid computation in GetNextXidAndEpoch doesn't return
correct results anymore because it can't recognize that were in a new
epoch now.

Trivial patch attached.

I am not sure that I understand why its interesting to have a newer
->nextXid in the checkpoint, but I don't see anything that would
make it dangerous besides this.

This looks like it also explains #6291 and the slighly different issue
described in CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=QFgg(at)mail(dot)gmail(dot)com,
as the issue is just as present on standbys as it is on primaries.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Fix-xid-epoch-calculation-with-wal_level-hot_standby.patch text/x-patch 2.0 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Andres Freund 2012-12-01 23:10:22 Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Previous Message Jeff Janes 2012-12-01 22:59:10 Re: PITR potentially broken in 9.2