Re: pg_multixact not getting truncated

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_multixact not getting truncated
Date: 2014-11-04 01:24:04
Message-ID: 54582AB4.8060105@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/03/2014 05:06 PM, Alvaro Herrera wrote:
> Josh Berkus wrote:
>> Hackers,
>>
>> I'm looking at a couple of high-transaction-rate and high-FK-conflict
>> rate servers where pg_multixact has grown to be more than 1GB in size.
>> One such server doesn't appear to be having any notable issues with
>> vacuuming, and the oldest mxid on the system is about 47m old. VACUUM
>> FREEZEing the oldest databases did not cause the pg_multixact dir to get
>> smaller --- it may have even caused it to get larger.
>>
>> Why would pg_multixact not be truncating? Does it never truncate files
>> with aborted multixacts in them? Might we have another multixact bug?
>
> Have a look at the MultiXactId values in pg_controldata, datminmxid in
> pg_database, and relminmxid in pg_class. They should advance as you
> VACUUM FREEZE. If it's stuck at 1, you might be in pg_upgrade trouble
> if you used 9.3.4 or earlier to upgrade.

They're advancing. I also know for a fact that this system was not
pg_upgraded, because I set it up. It's always been on 9.3.5.

So after some vacuum freezing and a checkpoint, the oldestMultiXid
advanced by 9 million, about 20% of the gap with current multiXid.
However, that did not delete 20% of the files; before the Freeze there
were 4414 files, and now there are 4195. So advancing 9m mxids resulted
in only deleting ~~130 files. So I think we definitely have an issue here.

BTW, the reason I started poking into this was a report from a user that
they have a pg_multixact directory which is 21GB in size, and is 2X the
size of the database.

Here's XID data:

Latest checkpoint's NextXID: 0/1126461940
Latest checkpoint's NextOID: 371135838
Latest checkpoint's NextMultiXactId: 162092874
Latest checkpoint's NextMultiOffset: 778360290
Latest checkpoint's oldestXID: 945761490
Latest checkpoint's oldestXID's DB: 370038709
Latest checkpoint's oldestActiveXID: 1126461940
Latest checkpoint's oldestMultiXid: 123452201
Latest checkpoint's oldestMulti's DB: 370038709

Oldest mxid file is 29B2, newest is 3A13

No tables had a relminmxid of 1 (some of the system tables were 0,
though), and the data from pg_class and pg_database is consistent.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2014-11-04 01:40:36 Re: pg_multixact not getting truncated
Previous Message Tom Lane 2014-11-04 01:10:05 Re: Pipelining executions to postgresql server