Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Date: 2014-07-20 21:22:48
Message-ID: 15685.1405891368@sss.pgh.pa.us
Lists: pgsql-bugs

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-07-20 16:16:18 -0400, Tom Lane wrote:
>> This one is not about the extra 0000 file. It's about whether datminmxid
>> and relminmxid are wrong. In the previous coding of pg_upgrade, they'd
>> have been left at "1" even if that value has wrapped around into the
>> future.

> Yea, I'm rereading the thread atm. I'd stopped following it after a
> while and didn't notice the drift into a second problem.

I've been doing more analysis on this. It looks to me like most people
would not notice a problem, but ...

1. When updating from a pre-9.3 version, pg_upgrade sets pg_control's
oldestMultiXid to equal the old cluster's NextMultiXactId (with a useless
increment of the new NextMultiXactId, but that's not too relevant).
This might seem silly, because there are very probably mxids out there
that are before this value, but the multixact code is designed to assume
that LOCKED_ONLY mxids before oldestMultiXid are not running --- without
ever going to pg_multixact to check. So in fact, there will be no
attempts to access on-disk data about old mxids; at least as long as the
mxid counters haven't wrapped around.

2. However, pg_upgrade also sets datminmxid/relminmxid to equal the old
cluster's NextMultiXactId. The trouble with this is that it might fool
(auto)vacuum into never seeing and freezing the pre-upgrade mxids; they're
in the table but the metadata says not, so we'd not force a full table
scan to find them. If such a mxid manages to survive past wraparound,
it'll be "in the future" and the multixact code will start complaining
about it, like this:

    if (!MultiXactIdPrecedes(multi, nextMXact))
        ereport(ERROR,
                (errcode(ERRCODE_INTERNAL_ERROR),
                 errmsg("MultiXactId %u has not been created yet -- apparent wraparound",
                        multi)));

3. In practice, because full-table vacuum scans get forced periodically
anyway to advance relminxid, it's entirely likely that old mxids would get
frozen before there was a problem. vacuum will freeze an old mxid on
sight, whatever the reason for visiting it.

4. The patch Bruce applied to initialize datminmxid/relminmxid to the old
NextMultiXactId rather than 1 does not fundamentally change anything here.
It narrows the window in which wraparound can cause problems, but only by
the distance that "1" is in-the-future at the time of upgrade. Indeed,
you could argue that it makes things worse, because it *guarantees* that
vacuum is unaware of old mxids existing in the table, whereas with the "1"
you had no such problem unless you were more than halfway to the wrap point.
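
To make the "in the future" behavior in points 2 and 4 concrete, here is a
minimal standalone sketch of a wraparound-aware mxid comparison. It assumes
mxids are 32-bit unsigned counters compared by signed modular difference;
the typedef and the names mxid_precedes and mxid_looks_unborn are
illustrative stand-ins, not the actual multixact.c code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a 32-bit unsigned multixact counter. */
typedef uint32_t MultiXactId;

/*
 * Wraparound-aware "a is older than b": interpret the modular difference
 * as a signed 32-bit value, so every mxid sees half of the counter space
 * as its past and the other half as its future.
 */
static bool
mxid_precedes(MultiXactId a, MultiXactId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical helper: would a check of the form
 * "!precedes(multi, next)" (as in the snippet quoted in point 2)
 * reject this mxid as not yet created?
 */
static bool
mxid_looks_unborn(MultiXactId multi, MultiXactId next_mxid)
{
    return !mxid_precedes(multi, next_mxid);
}
```

Under these assumptions, mxid_looks_unborn(1, 1000) is false, so a stored
value of "1" still reads as the past. With the next-mxid counter at
0x90000000, mxid_looks_unborn(1, 0x90000000) is true: that is the "more than
halfway to the wrap point" case, where "1" compares as being in the future.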

What we've effectively got ATM, with either the patched or unpatched code,
is that you're safe to the extent that relminxid-driven freezing happens
often enough to get rid of mxids before they wrap. I note that that's
exactly the situation we had pre-9.3. That being the case, and in view
of the lack of complaints from the field about it, I wonder whether we
aren't greatly overreacting. If the alternative is that pg_upgrade forces
cluster-wide freezing activity to occur immediately upon system startup,
I'd definitely say that the cure is worse than the disease.

regards, tom lane
