Re: BUG #7710: Xid epoch is not updated properly during checkpoint

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: tarvip(at)gmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Date: 2012-12-01 22:56:33
Message-ID: 24367.1354402593@sss.pgh.pa.us
Lists: pgsql-bugs

tarvip(at)gmail(dot)com writes:
> [ txid_current can show a bogus value near XID wraparound ]
> This happens only if wal_level=hot_standby.

I believe what is happening here is:

(1) CreateCheckPoint sets up checkPoint.nextXid and
checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD. At this point,
nextXid is still a bit less than the wrap point.

(2) After performing the checkpoint, at line 7113, CreateCheckPoint
calls LogStandbySnapshot() which "helpfully" updates checkPoint.nextXid
to the latest value, which by now has wrapped around. But it doesn't
fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has
effectively lost the epoch bump that should have happened.
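
To make the sequence in (1) and (2) concrete, here is a minimal
standalone sketch in C. It is not the xlog.c code. The struct, the
function names, and the standby_overwrites_xid flag are hypothetical,
and it assumes the epoch is derived by comparing the new nextXid against
the one recorded in the previous checkpoint.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in for the two checkpoint fields discussed above. */
typedef struct
{
    uint32_t      nextXidEpoch;
    TransactionId nextXid;
} SketchCheckPoint;

/* Combine epoch and XID into the 64-bit value a caller would see. */
static uint64_t
sketch_full_xid(const SketchCheckPoint *cp)
{
    return ((uint64_t) cp->nextXidEpoch << 32) | cp->nextXid;
}

/*
 * Model one checkpoint.
 *
 * xid_at_start is the XID counter when the checkpoint begins, and
 * xid_at_end is the counter after the main body of the checkpoint.
 * The epoch rule below (bump when nextXid went backwards relative to
 * the previous checkpoint) is an assumption of this sketch.
 */
static SketchCheckPoint
sketch_create_checkpoint(SketchCheckPoint prev,
                         TransactionId xid_at_start,
                         TransactionId xid_at_end,
                         int standby_overwrites_xid)
{
    SketchCheckPoint cp;

    /* Step (1): capture nextXid and derive the epoch from it. */
    cp.nextXid = xid_at_start;
    cp.nextXidEpoch = prev.nextXidEpoch;
    if (cp.nextXid < prev.nextXid)
        cp.nextXidEpoch++;      /* counter wrapped since prev */

    /* Step (2): the standby path replaces nextXid but not the epoch. */
    if (standby_overwrites_xid)
        cp.nextXid = xid_at_end;

    return cp;
}
```

Starting from a previous checkpoint with epoch 0 and nextXid 0xFFFFF000,
a checkpoint that begins at 0xFFFFFF00 and ends at 0x10 comes out with
epoch 0 and nextXid 0x10 when the flag is set. The following checkpoint
then sees no backwards step, so the bump never happens. With the flag
clear, the first checkpoint keeps 0xFFFFFF00 and the next one detects
the wrap and records epoch 1.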

While we could add some more logic to try to correct the epoch value
in this scenario, I think it's a much better idea to just stop having
LogStandbySnapshot update the nextXid. That seems to me to be useless
complication. I also quite dislike the fact that we're effectively
redefining the checkpoint nextXid from being taken before the main
body of the checkpoint to being taken afterwards, but *only* in
XLogStandbyInfoActive mode. If that inconsistency isn't already causing
bugs (besides this one) today, it'll probably cause them in the future.

So barring objections, I'm going to remove LogStandbySnapshot's behavior
of returning the updated nextXid.

regards, tom lane
