Quick Links

Re: MultiXact truncation, startup et al.

Lists:	pgsql-hackers

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	MultiXact truncation, startup et al.
Date:	2013-11-21 19:38:47
Message-ID:	20131121193847.GL7240@alap2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

Heikki's comments about StartupMultiXact() being executed too late in
http://archives.postgresql.org/message-id/528C9392.8000004%40vmware.com
made me look at how multixact truncation works, since that's where
SimpleLruTruncate() would trigger an error because of an unitialized
latest_page_number.

Turns out, we don't ever truncate pg_multixact during recovery since
9dc842f0832fd71eda826349a0c17ecf8ae93b84 because multixact truncations,
in contrast to clog, aren't WAL logged themselves. Disabling probably
was fair game back then since it wasn't too likely to remain in crash
recovery forever.
But at the very least since the addition of Hot Standby that's really
not the case anymore. If I calculate correctly currently you'd end up
with ~34GB(<9.3)/38GB of pg_multixact which seems a bit much.

I am not 100% sure, but it looks like things could actually continue to
work despite having an slru wraparound into existing data. But that's
certainly nothing I'd want to rely on and looks mostly like lucky
happenstance, especially in 9.3.

If this were a master only issue, I'd say WAL-logging mxact truncation
would be the way to go, but we can't really do that in the back branches
since multixact_redo() would throw a fit if we were to introduce a new
type of wal record and somebody would upgrade a primary first.

So, what I think we need to do is to split StartupMultiXact() into two
parts, StartupMultiXact() which only sets the offset's, members's
shared->latest_page_number and TrimMultiXact() which does the remainder
of the work, executed when finishing crash recovery at the current
location of StartupMultiXact().

At this point <9.3 and 9.3+ diverge:

<9.3 we can just re-enable doing TruncateMultiXact() in
CheckPointMultiXact() since the content of the mxacts isn't important
anymore - we just need the WAL records to be sure to have advanced the
nextMXact, nextOffset counters to be sure they are big enough.

>= 9.3 multixact checkpoints don't actually perform the truncation
anymore, but vacuum does, via vac_truncate_clog(). Which obviously isn't
executed during recovery. So we need to re-add truncation there,
possibly only during recovery (everything else should be superflous
work)?

Any comments?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-27 22:42:13
Message-ID:	20131127224213.GK31748@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2013-11-21 20:38:47 +0100, Andres Freund wrote:
> Turns out, we don't ever truncate pg_multixact during recovery since
> 9dc842f0832fd71eda826349a0c17ecf8ae93b84 because multixact truncations,
> in contrast to clog, aren't WAL logged themselves. Disabling probably
> was fair game back then since it wasn't too likely to remain in crash
> recovery forever.
> But at the very least since the addition of Hot Standby that's really
> not the case anymore. If I calculate correctly currently you'd end up
> with ~34GB(<9.3)/38GB of pg_multixact which seems a bit much.
>
> I am not 100% sure, but it looks like things could actually continue to
> work despite having an slru wraparound into existing data. But that's
> certainly nothing I'd want to rely on and looks mostly like lucky
> happenstance, especially in 9.3.
>
> If this were a master only issue, I'd say WAL-logging mxact truncation
> would be the way to go, but we can't really do that in the back branches
> since multixact_redo() would throw a fit if we were to introduce a new
> type of wal record and somebody would upgrade a primary first.
>
> So, what I think we need to do is to split StartupMultiXact() into two
> parts, StartupMultiXact() which only sets the offset's, members's
> shared->latest_page_number and TrimMultiXact() which does the remainder
> of the work, executed when finishing crash recovery at the current
> location of StartupMultiXact().

So, I've done this for 9.3+ for now. Testing around that turned up that
our current way to schedule anti mxid wraparounds doesn't really work:
1) autovacuum.c knows about such vacuums, but vacuum.c doesn't. Leading
to a long cycle of partial vacuums that don't increase relminmxid.
2) Parts of the code used 200mio as a hardcoded constant, others used
autovacuum_freeze_max_age.

0001 fixes the vacuum scheduling and is applicable to 9.3+,

0002 re-adds pg_multixact truncation during crash recovery. The current
code will only work on 9.3+, but if it's deemed acceptable I can
backport it to earlier versions. I am not sure if it's worth backporting
it 9.0 given it has neither HS nor SR?

0003 is a debugging only patch adding the useful pg_burn_multixact(num)
function to pageinspect (plus some core changes to make that fast) and
allows for low autovacuum_freeze_max_age settings.

Not sure if it's really worth adding MultiXactIdPrecedesOrEquals in
0002, but I didn't want to differ in the scan_all logic normal xids ids
and mxids. I think it'd also be fine to change the logic for xids to use
TransactionIdPrecedes(), but I didn't want to touch that logic
unnecessarily.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
0001-Fix-various-problems-around-multixact-vacuuming.patch	text/x-patch	8.3 KB
0002-Truncate-pg_multixact-s-contents-during-crash-recove.patch	text/x-patch	7.5 KB
0003-Debugging-tool-for-burning-mxact-ids.patch	text/x-patch	5.5 KB

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 00:23:29
Message-ID:	CA+TgmoZ-i4FKCESesrdiUtO71ZPd0MjQCU17eGpWpZ8cSBz7tQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Nov 27, 2013 at 5:42 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> So, I've done this for 9.3+ for now. Testing around that turned up that
> our current way to schedule anti mxid wraparounds doesn't really work:
> 1) autovacuum.c knows about such vacuums, but vacuum.c doesn't. Leading
> to a long cycle of partial vacuums that don't increase relminmxid.
> 2) Parts of the code used 200mio as a hardcoded constant, others used
> autovacuum_freeze_max_age.
>
> 0001 fixes the vacuum scheduling and is applicable to 9.3+,

multiTableLimit is a bad name for whatever concept this is supposed to
represent. It does not involve multiple tables.
vacuum_set_xid_limits now has multiXactCutoff, multiTableLimit, and
mxLimit, and there's no explanation of what the are.

> 0002 re-adds pg_multixact truncation during crash recovery. The current
> code will only work on 9.3+, but if it's deemed acceptable I can
> backport it to earlier versions. I am not sure if it's worth backporting
> it 9.0 given it has neither HS nor SR?

Huh?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 00:40:19
Message-ID:	20131129004019.GA5645@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-28 19:23:29 -0500, Robert Haas wrote:
> On Wed, Nov 27, 2013 at 5:42 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > So, I've done this for 9.3+ for now. Testing around that turned up that
> > our current way to schedule anti mxid wraparounds doesn't really work:
> > 1) autovacuum.c knows about such vacuums, but vacuum.c doesn't. Leading
> > to a long cycle of partial vacuums that don't increase relminmxid.
> > 2) Parts of the code used 200mio as a hardcoded constant, others used
> > autovacuum_freeze_max_age.
> >
> > 0001 fixes the vacuum scheduling and is applicable to 9.3+,
>
> multiTableLimit is a bad name for whatever concept this is supposed to
> represent. It does not involve multiple tables.

It's the freezeTableLimit equivalent for multixacts. I haven't named it
multiFreezeTableLimit because multixacts don't actually have a concept
of freezing - and in that context I didn't see much danger for it to be
understood to apply to multiple tables.

I'll try to think of a better name.

> vacuum_set_xid_limits now has multiXactCutoff, multiTableLimit, and
> mxLimit, and there's no explanation of what the are.

Well, neither have the plain xid variants. Not that that's good, but
it's not this patches fault.
I'd be happy to provide a separate patch that adds documentation for
each of them - I've wondered often enough about their meaning to make it
rather worthwhile.

> > 0002 re-adds pg_multixact truncation during crash recovery. The current
> > code will only work on 9.3+, but if it's deemed acceptable I can
> > backport it to earlier versions. I am not sure if it's worth backporting
> > it 9.0 given it has neither HS nor SR?
>
> Huh?

That should have been a < 9.0 aka. 8.4.

Patch 02 has changed its shape slightly since the version I posted here,
because it's a prerequisite for the fix for the multixact bugs around
heap_freeze_tuple() and heap_tuple_needs_freeze() I've written about
nearby. I think Alvaro plans to work over my fixes to make them look
nice enough and commit them soonish.
Should somebody want to look into it
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/multixact-handling
contains the fixes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 04:00:35
Message-ID:	20131129040035.GK5513@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andres Freund escribió:

> Patch 02 has changed its shape slightly since the version I posted here,
> because it's a prerequisite for the fix for the multixact bugs around
> heap_freeze_tuple() and heap_tuple_needs_freeze() I've written about
> nearby. I think Alvaro plans to work over my fixes to make them look
> nice enough and commit them soonish.
> Should somebody want to look into it
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/multixact-handling
> contains the fixes.

Here's a tweaked version of the freeze patch.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
freezing.patch	text/x-diff	8.1 KB

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 08:09:54
Message-ID:	20131129080954.GB5645@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

Thanks, looks saner than my version ;)

On 2013-11-29 01:00:35 -0300, Alvaro Herrera wrote:
> ! /*
> ! * FIXME this calls TransactionIdDidAbort() internally,
> ! * falsifying the claim in the comment at the top ...
> ! */
> ! update_xid = HeapTupleGetUpdateXid(tuple);

Yea. I think we should remove that behaviour from
HeapTupleGetUpdateXid() - we don't need to rely on it here.

> ! /*
> ! * XXX we rely here on HeapTupleGetUpdateXid returning
> ! * Invalid for aborted updates.
> ! */
> ! if (!TransactionIdIsValid(update_xid))
> ! freeze_xmax = true;

I've just added this case because HeapTupleGetUpdateXid() currently can
return InvalidTransactionId - I'd be perfectly fine to just place the
aborted xid in there.

> ! * FIXME -- what if it's a committed update which has recent
> ! * new locker transaction? The tuple wouldn't have been
> ! * removed in that case (HeapTupleSatisfiesVacuum returns
> ! * RECENTLY_DEAD).
> ! */

Afaics we should be protected against that by virtue of GetOldestMultiXactId().

> ***************
> *** 5645,5668 **** heap_tuple_needs_freeze(HeapTupleHeader tuple, TransactionId cutoff_xid,
> ! /* we don't care about lockers */
> ! if (members[i].status != MultiXactStatusNoKeyUpdate ||
> ! members[i].status == MultiXactStatusUpdate)
> ! continue;

Dangerous typo alert. Should also be replaced by
ISUPDATE_from_mxstatus(). Alternatively, if we decide to make
HeapTupleGetUpdateXid() not make that DidAbort() check, we can simply
get rid of doing the loop ourselves alltogethers.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 15:49:32
Message-ID:	20131129154932.GN5513@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andres Freund escribió:
> Hi,
>
> Thanks, looks saner than my version ;)

Here are my versions of these patches. One thing not yet addressed in
these patches is the change to HeapTupleGetUpdateXid() that has been
mentioned repeatedly in these threads (we'll work on that next). Note I
still have the FIXME comments in the freeze patch previously mentioned.

I added some comments about the output parameters of
vacuum_set_xid_limits, per Robert's complaint. Please give that a look.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
0001-Replace-hardcoded-200000000-with-autovacuum_freeze_m.patch	text/x-diff	2.1 KB
0002-Fix-full-table-vacuum-request-mechanism-for-MultiXac.patch	text/x-diff	10.0 KB
0003-Truncate-pg_multixact-s-contents-during-crash-recove.patch	text/x-diff	8.0 KB
0004-Fix-a-couple-of-bugs-in-MultiXactId-freezing.patch	text/x-diff	9.6 KB

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 15:59:42
Message-ID:	20131129155942.GA20057@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-29 12:49:32 -0300, Alvaro Herrera wrote:
> + * - xidFullScanLimit (also known as the table freeze age) represents the
> + * minimum Xid age past which a vacuum will be turned into a full-table one,
> + * to freeze tuples across the whole table. Vacuuming a table younger than
> + * this can use a partial scan.

Imo "age" isn't really appropriate, since it's a concrete xid that's the
cutoff. It's determined by vacuum_freeze_table_age, sure, but at that
point it's an absolute value.

> + * - mxactFullScanLimit (as xidFullScanLimit) represents the minimum MultiXact
> + * age past which a vacuum will be turned into a full-table one, as with
> + * xidFullScanLimit.

Not an age again.

> + scan_all |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
> + mxactFullScanLimit);

Hah. That's cute ;).
> @@ -1906,6 +1931,7 @@ CheckPointMultiXact(void)
> SimpleLruFlush(MultiXactOffsetCtl, true);
> SimpleLruFlush(MultiXactMemberCtl, true);
>
> +
> TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE(true);
> }

My fault, but superflous newline added.

> @@ -8619,6 +8623,22 @@ CreateRestartPoint(int flags)
> }
> LWLockRelease(ControlFileLock);
>
> +

Also an inconsistent newline, again by me :(

> - multi = HeapTupleHeaderGetRawXmax(tuple);
> - if (MultiXactIdPrecedes(multi, cutoff_multi))
> - return true;
> + nmembers = GetMultiXactIdMembers(xid, &members, true);
> + for (i = 0; i < nmembers; i++)
> + {
> + TransactionId member = members[i].xid;
> + Assert(TransactionIdIsNormal(member));
> +
> + /* we don't care about lockers */
> + if (ISUPDATE_from_mxstatus(members[i].status))
> + continue;

Isn't there a ! missing?

> diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
> index c389bf3..518c22d 100644
> --- a/src/backend/access/transam/multixact.c
> +++ b/src/backend/access/transam/multixact.c
> @@ -445,6 +445,10 @@ MultiXactIdExpand(MultiXactId multi, TransactionId xid, MultiXactStatus status)
>
> for (i = 0, j = 0; i < nmembers; i++)
> {
> + /*
> + * FIXME: is it possible that we copy over too old updater xids
> + * here?
> + */
> if (TransactionIdIsInProgress(members[i].xid) ||
> ((members[i].status > MultiXactStatusForUpdate) &&
> TransactionIdDidCommit(members[i].xid)))

That's not really a new thing though, so I am fine with leaving that as
is for now.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 19:30:08
Message-ID:	20131129193008.GP5513@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

New versions of all these patches, plus one more patch which removes the
behavior that HeapTupleGetUpdateXid checks for aborted updates. (Turns
out this was necessary to get freezing right.) My previous patch to
avoid InvalidXid as page prune point is reverted in there, too (no
longer necessary.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
0001-Replace-hardcoded-200000000-with-autovacuum_freeze_m.patch	text/x-diff	2.1 KB
0002-Fix-full-table-vacuum-request-mechanism-for-MultiXac.patch	text/x-diff	10.0 KB
0003-Truncate-pg_multixact-s-contents-during-crash-recove.patch	text/x-diff	8.1 KB
0004-Don-t-TransactionIdDidAbort-in-HeapTupleGetUpdateXid.patch	text/x-diff	10.9 KB
0005-Fix-a-couple-of-bugs-in-MultiXactId-freezing.patch	text/x-diff	9.7 KB

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 19:58:06
Message-ID:	20131129195806.GA6237@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-29 16:30:08 -0300, Alvaro Herrera wrote:
> New versions of all these patches, plus one more patch which removes the
> behavior that HeapTupleGetUpdateXid checks for aborted updates.

> From 0dce0b75da2732ca93f4c451b9bae6d4416794c3 Mon Sep 17 00:00:00 2001
> From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> Date: Fri, 29 Nov 2013 16:08:06 -0300
> Subject: [PATCH 4/5] Don't TransactionIdDidAbort in HeapTupleGetUpdateXid
>
> It is dangerous to do so, because some code expects to be able to see what's
> the true Xmax even if it is aborted (particularly while traversing HOT
> chains). So don't do it, and instead rely on the callers to verify for
> abortedness, if necessary.
>
> Several race conditions and bugs fixed in the process.

The current version of that additional patch causes two failures in
isolationtester's delete-abort-savept.spec/out. But afaics the old
behaviour was a bug: An updater seems to have waited for an aborted
subtransaction.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 20:45:29
Message-ID:	20131129204529.GA14712@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-29 16:30:08 -0300, Alvaro Herrera wrote:
> As a second bug, heap_freeze_tuple() didn't properly handle multixacts
> that need to be frozen according to cutoff_multi, but whose updater xid
> is still alive. Instead of preserving the update Xid, it just set Xmax
> invalid, which leads to both old and new tuple versions becoming
> visible. This is pretty rare in practice, but a real threat
> nonetheless. Existing corrupted rows, unfortunately, cannot be repaired
> in an automated fashion.

I think this bug is worth mentioning here explicitly. As released, if
you have a table where some rows are updated using an xmax as multi, and
you freeze it you're pretty likely to experience corruption where you
see both the old and the new version of a tuple as live. I haven't seen
this one in the wild but just noticed it while looking at the other
freezing bug, but it's really quite easy to reproduce. As demonstrated
in the attached isolationtester spec which doubles the row count via an
UPDATE in 9.3/HEAD.

> + * Note the update Xid cannot possibly be older than
> + * cutoff_xid; if it were, we wouldn't be here: if committed,
> + * the tuple would have been pruned, and if aborted, the Xmax
> + * would have been marked Invalid by HeapTupleSatisfiesVacuum.
> + * (Not in-progress either, because then cutoff_xid would be
> + * newer.)

s/newer/older/?

> @@ -5644,24 +5729,54 @@ heap_tuple_needs_freeze(HeapTupleHeader tuple, TransactionId cutoff_xid,
> TransactionIdPrecedes(xid, cutoff_xid))
> return true;

Maybe add a comment referring to heap_freeze_tuple() for justification
of individual parts and that it needs to be kept in sync?

> + nmembers = GetMultiXactIdMembers(xid, &members, true);
> + for (i = 0; i < nmembers; i++)
> + {
> + TransactionId member = members[i].xid;
> +
> + Assert(TransactionIdIsNormal(member));
> +
> + /* we don't care about lockers */
> + if (!ISUPDATE_from_mxstatus(members[i].status))
> + continue;
> +
> + if (TransactionIdPrecedes(member, cutoff_xid))
> + {
> + pfree(members);
> + return true;
> + }
> + }

This now can use GetUpdateXid() as well.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
mxact-vacuum.spec	text/plain	771 bytes

From:	Kevin Grittner <kgrittn(at)ymail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 21:26:00
Message-ID:	1385760360.65292.YahooMailNeo@web162901.mail.bf1.yahoo.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> New versions of all these patches, plus one more patch which
> removes the behavior that HeapTupleGetUpdateXid checks for
> aborted updates. (Turns out this was necessary to get freezing
> right.) My previous patch to avoid InvalidXid as page prune
> point is reverted in there, too (no longer necessary.)

Does this mean that when HeapTupleSatisfiesVacuum() returns
HEAPTUPLE_DELETE_IN_PROGRESS it is no longer possible for
HeapTupleHeaderGetUpdateXid() to return InvalidTransactionId?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-29 21:28:12
Message-ID:	20131129212812.GC14712@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-29 13:26:00 -0800, Kevin Grittner wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> > New versions of all these patches, plus one more patch which
> > removes the behavior that HeapTupleGetUpdateXid checks for
> > aborted updates. (Turns out this was necessary to get freezing
> > right.) My previous patch to avoid InvalidXid as page prune
> > point is reverted in there, too (no longer necessary.)
>
> Does this mean that when HeapTupleSatisfiesVacuum() returns
> HEAPTUPLE_DELETE_IN_PROGRESS it is no longer possible for
> HeapTupleHeaderGetUpdateXid() to return InvalidTransactionId?

Correct.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-30 01:15:18
Message-ID:	20131130011518.GQ5513@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Okay, I have pushed all these patches, including the fixes suggested
here and then some.

Hats off to Andres, who handled all the bug analysis, figured out the
test cases that tickled things in all the wrong ways, and came up with
the patches. Whenever we meet again, let's make sure to find a real
good restaurant and let me invite you a nice dinner.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-30 15:57:43
Message-ID:	22536.1385827063@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Okay, I have pushed all these patches, including the fixes suggested
> here and then some.

Not sure exactly which patch caused it, but I'm getting a warning
in 9.0 through 9.2:

multixact.c:1553: warning: no previous prototype for 'TrimMultiXact'

regards, tom lane

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-30 15:58:26
Message-ID:	1385827106.27340.18.camel@vanquo.pezone.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, 2013-11-29 at 22:15 -0300, Alvaro Herrera wrote:
> Okay, I have pushed all these patches, including the fixes suggested
> here and then some.

Something in these patches is causing a new compiler warning in the 9.2
branch:

multixact.c:1553:1: warning: no previous prototype for
‘TrimMultiXact’ [-Wmissing-prototypes]

http://pgci.eisentraut.org/jenkins/job/postgresql_9.2_world/404/warnings19Result/new/?

Something might be wrong there, because that new function doesn't appear
to be called anywhere.

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-30 16:06:31
Message-ID:	20131130160631.GC31100@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-30 10:57:43 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Okay, I have pushed all these patches, including the fixes suggested
> > here and then some.
>
> Not sure exactly which patch caused it, but I'm getting a warning
> in 9.0 through 9.2:
>
> multixact.c:1553: warning: no previous prototype for 'TrimMultiXact'

Hm. There's been a mistake during the back-patching of that
patch. Likely due to the difference in how multixacts work between 9.2
and 9.3.
Let me provide a fixup patch.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MultiXact truncation, startup et al.
Date:	2013-11-30 16:34:47
Message-ID:	20131130163447.GE31100@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2013-11-30 17:06:31 +0100, Andres Freund wrote:
> On 2013-11-30 10:57:43 -0500, Tom Lane wrote:
> > Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > > Okay, I have pushed all these patches, including the fixes suggested
> > > here and then some.
> >
> > Not sure exactly which patch caused it, but I'm getting a warning
> > in 9.0 through 9.2:
> >
> > multixact.c:1553: warning: no previous prototype for 'TrimMultiXact'
>
> Hm. There's been a mistake during the back-patching of that
> patch. Likely due to the difference in how multixacts work between 9.2
> and 9.3.
> Let me provide a fixup patch.

Attached.

I've tested that pg_multixact gets cleaned up on both primary and a HS
standby, and that things continue work after failover.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
0001-Fix-incomplete-backpatch-of-the-pg_multixact-truncat.patch	text/x-patch	2.7 KB