BUG #4929: Corrupted pg_class, possibly truncate/rollback related

Lists: pgsql-bugs
From: "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Date: 2009-07-20 22:46:14
Message-ID: 200907202246.n6KMkE46057083@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 4929
Logged by: Robert Treat
Email address: xzilla(at)users(dot)sourceforge(dot)net
PostgreSQL version: 8.3.1
Operating system: Linux 2.6.18-53.1.14.el5 #1 SMP x86_64 GNU/Linux
(Centos)
Description: Corrupted pg_class, possibly truncate/rollback related
Details:

Howdy,

I wanted to see if a situation I've encountered is a known issue or perhaps
worth more investigation. I'm working on a Postgres 8.3.1 system, where I've
uncovered some catalog corruption. The problems started with a failing
pg_dump...

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: more than one row returned by a
subquery used as an expression
pg_dump: The command was: SELECT tableoid, oid, typname, typnamespace,
(SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) as rolname,
typinput::oid as typinput, typoutput::oid as typoutput
, typelem, typrelid, CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT
relkind FROM pg_class WHERE oid = typrelid) END as typrelkind, typtype,
typisdefined, typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM
pg_type te WHERE oid = pg_type.typelem) = oid AS isarray FROM pg_type

I tracked this down to this problem in pg_class

pe2=# select xmin, xmax, cmin, cmax, ctid, oid, relnamespace, relname,
reltype, relowner, relfilenode, relpages, reltuples from pg_class where oid
= 23708;
xmin | xmax | cmin | cmax | ctid | oid | relnamespace |
relname | reltype | relowner | relfilenode | relpages | reltuples
------------+------------+------+------+----------+-------+--------------+--
--------+---------+----------+-------------+----------+-------------
3291061347 | 0 | 6 | 6 | (1118,2) | 23708 | 23681 |
prooln_m | 23710 | 10 | 654963 | 114055 | 7.42746e+06
2 | 3291061347 | 6 | 6 | (23,39) | 23708 | 23681 |
prooln_m | 23710 | 10 | 181519 | 104401 | 6.5017e+06

So, clearly this is bad. It seems we've gotten some level of corruption on
disk. The most perculiar bits of information around this system are that we
tend to have long running vacuum jobs (multiple days), and we recently did a
truncate + rollback within a transaction on the table in question. I've also
noticed that the file on disk for the 181519 row is not actually there. Oh,
and this does run on lvm, though we haven't used the lvm feature set for a
long time. So, worth investigating?


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: xzilla(at)users(dot)sourceforge(dot)net ("Robert Treat"), pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Date: 2009-07-20 23:05:22
Message-ID: 874ot7q7kd.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

>>>>> "Robert" == "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net> writes:

Robert> Bug reference: 4929
Robert> Logged by: Robert Treat
Robert> Email address: xzilla(at)users(dot)sourceforge(dot)net
Robert> PostgreSQL version: 8.3.1

pe2=# select xmin, xmax, cmin, cmax, ctid, oid, relnamespace, relname,
reltype, relowner, relfilenode, relpages, reltuples from pg_class where oid = 23708;
xmin | xmax | cmin | cmax | ctid | oid | relnamespace | relname | reltype | relowner | relfilenode | relpages | reltuples
------------+------------+------+------+----------+-------+--------------+----------+---------+----------+-------------+----------+-------------
3291061347 | 0 | 6 | 6 | (1118,2) | 23708 | 23681 | prooln_m | 23710 | 10 | 654963 | 114055 | 7.42746e+06
2 | 3291061347 | 6 | 6 | (23,39) | 23708 | 23681 | prooln_m | 23710 | 10 | 181519 | 104401 | 6.5017e+06

Robert> So, clearly this is bad. It seems we've gotten some level of
Robert> corruption on disk. The most perculiar bits of information
Robert> around this system are that we tend to have long running
Robert> vacuum jobs (multiple days), and we recently did a truncate +
Robert> rollback within a transaction on the table in question. I've
Robert> also noticed that the file on disk for the 181519 row is not
Robert> actually there. Oh, and this does run on lvm, though we
Robert> haven't used the lvm feature set for a long time. So, worth
Robert> investigating?

I did some analysis on this at Robert's request on IRC. Here are hexdumps
of the offending tuples:

(1118,2):

00001e60 63 9c 29 c4 00 00 00 00 06 00 00 00 00 00 5e 04 |c.)...........^.|
00001e70 02 00 1b 00 0b 29 20 ff ff ff 03 00 9c 5c 00 00 |.....) ......\..|
00001e80 70 72 6f 6f 6c 6e 5f 6d 00 00 00 00 00 00 00 00 |prooln_m........|
00001e90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001ec0 81 5c 00 00 9e 5c 00 00 0a 00 00 00 00 00 00 00 |.\...\..........|
00001ed0 73 fe 09 00 00 00 00 00 87 bd 01 00 02 ab e2 4a |s..............J|
00001ee0 00 00 00 00 00 00 00 00 01 00 72 00 0f 00 00 00 |..........r.....|
00001ef0 00 00 00 00 00 00 00 00 00 01 00 00 42 9c 29 c4 |............B.).|
00001f00 5b 01 00 00 00 00 00 00 00 09 04 00 00 02 00 00 |[...............|
00001f10 00 01 00 00 00 0a 00 00 00 0a 00 00 00 6f 00 00 |.............o..|
00001f20 00 15 5a 02 00 0a 00 00 00 02 00 00 00 00 00 00 |..Z.............|

Everything above looks about as we expect.

(1118,1): this is an earlier version of the tuple, correctly marked dead,
presumably resulting from the truncate/rollback referred to above:

00001f30 19 55 23 c4 00 00 00 00 06 00 00 00 00 00 5e 04 |.U#...........^.|
00001f40 01 00 1b 00 0b 2a 20 ff ff ff 03 00 9c 5c 00 00 |.....* ......\..|
00001f50 70 72 6f 6f 6c 6e 5f 6d 00 00 00 00 00 00 00 00 |prooln_m........|
00001f60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001f90 81 5c 00 00 9e 5c 00 00 0a 00 00 00 00 00 00 00 |.\...\..........|
00001fa0 72 fe 09 00 00 00 00 00 00 00 00 00 00 00 00 00 |r...............|
00001fb0 00 00 00 00 00 00 00 00 01 00 72 00 0f 00 00 00 |..........r.....|
00001fc0 00 00 00 00 00 00 00 00 00 01 00 00 19 55 23 c4 |.............U#.|
00001fd0 5b 01 00 00 00 00 00 00 00 09 04 00 00 02 00 00 |[...............|
00001fe0 00 01 00 00 00 0a 00 00 00 0a 00 00 00 6f 00 00 |.............o..|
00001ff0 00 15 5a 02 00 0a 00 00 00 02 00 00 00 00 00 00 |..Z.............|

Again, no obvious surprises.

This is (23,39) which is the broken one; note HEAP_XMAX_INVALID is
set, despite the fact that this is a transaction that committed (as
evidenced by the removal of the old relfilenode) so the row is showing
up incorrectly to queries. Note also that HEAP_ONLY_TUPLE is set (but
there are no other versions of this tuple on page 23).

00000a20 02 00 00 00 63 9c 29 c4 06 00 00 00 00 00 5e 04 |....c.).......^.|
00000a30 02 00 1b 80 0b 29 20 ff ff ff 03 00 9c 5c 00 00 |.....) ......\..|
00000a40 70 72 6f 6f 6c 6e 5f 6d 00 00 00 00 00 00 00 00 |prooln_m........|
00000a50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000a80 81 5c 00 00 9e 5c 00 00 0a 00 00 00 00 00 00 00 |.\...\..........|
00000a90 0f c5 02 00 00 00 00 00 d1 97 01 00 8a 6a c6 4a |.............j.J|
00000aa0 00 00 00 00 00 00 00 00 01 00 72 00 0f 00 00 00 |..........r.....|
00000ab0 00 00 00 00 00 00 00 00 00 01 00 00 c5 52 c5 b9 |.............R..|
00000ac0 5b 01 00 00 00 00 00 00 00 09 04 00 00 02 00 00 |[...............|
00000ad0 00 01 00 00 00 0a 00 00 00 0a 00 00 00 6f 00 00 |.............o..|
00000ae0 00 15 5a 02 00 0a 00 00 00 02 00 00 00 00 00 00 |..Z.............|

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Date: 2009-07-20 23:54:10
Message-ID: 29698.1248134050@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net> writes:
> PostgreSQL version: 8.3.1
> Description: Corrupted pg_class, possibly truncate/rollback related

FWIW, there is a bug fix in 8.3.4 addressing possible HOT-chain
corruption after a REINDEX of pg_class. Not sure if that could've been
the issue here.

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane), "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Date: 2009-07-21 00:59:53
Message-ID: 87prbuq29i.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net> writes:
>> PostgreSQL version: 8.3.1
>> Description: Corrupted pg_class, possibly truncate/rollback related

Tom> FWIW, there is a bug fix in 8.3.4 addressing possible HOT-chain
Tom> corruption after a REINDEX of pg_class. Not sure if that
Tom> could've been the issue here.

I asked Robert about that when we were discussing it on IRC, and he
said that no such REINDEX had been done (and nor had ALTER TABLE
RENAME and ALTER TABLE SET SCHEMA been done on the affected table).

However I'm wondering if another 8.3.4 fix, the RecentGlobalXmin one,
could be relevant here?
http://archives.postgresql.org/pgsql-committers/2008-09/msg00105.php
(I'm not seeing how it would be, but... note that the xids have got
to the point that they'd appear to be in the past from the point of
view of FirstNormalTransactionId)

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4929: Corrupted pg_class, possibly truncate/rollback related
Date: 2009-07-21 01:48:29
Message-ID: 2196.1248140909@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> However I'm wondering if another 8.3.4 fix, the RecentGlobalXmin one,
> could be relevant here?
> http://archives.postgresql.org/pgsql-committers/2008-09/msg00105.php
> (I'm not seeing how it would be, but... note that the xids have got
> to the point that they'd appear to be in the past from the point of
> view of FirstNormalTransactionId)

Hmmm ... I think you're onto something. My original speculation at
http://archives.postgresql.org/message-id/29544.1221061979@sss.pgh.pa.us
included a worry that the HOT patch could try to use RecentGlobalXmin
before it had gotten set. In particular, before the patch you mention
above, the InitPostgres transaction would be running around and doing
quite a lot of system catalog access with an unset RecentGlobalXmin.
If it happened to try to execute heap_page_prune on a pg_class page,
there would be a risk of deciding that some valid tuple wasn't valid.
I don't have time to try to trace through the logic and see if this
explains Robert's problem, though.

regards, tom lane