Help: my table is corrupt!

Lists: pgsql-admin
From: Jos van Roosmalen <josr(at)josr(dot)org>
To: pgsql-admin(at)postgresql(dot)org
Subject: Help: my table is corrupt!
Date: 2005-03-27 15:19:33
Message-ID: 67AC5485-F0F9-4118-A053-A1E5D7807D9D@josr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Hello,

Can someone explain me how to repair a table?

If I execute a 'select * from <brokentable>' in PSQL I get the next
error. A pg_dump export only partially export the broken table and
ends with the same error:

"Invalid memory alloc request size: 4294967290"

Note that this number is really close to 2^32.

I am running x86 Linux (Gentoo), with Postgresql 7.4.

This is what I tried to repair the table and all didn't work:

- Turning ZERO_DAMAGED_PAGES to true in postgresql.conf;
- Running REINDEX TABLE <brokentable>, and REINDEX DATABASE <thedb>
- Running VACUUM <brokentable>

This table contains arround 6 million records, and every record only
contains a few text, number and date attributes.

Queries/Exports on other tables works fine.

Is there a Postgresql Repair application which delete or repair
corrupt records?

Thanks in advance!

Jos


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jos van Roosmalen <josr(at)josr(dot)org>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Help: my table is corrupt!
Date: 2005-03-27 16:23:00
Message-ID: 5730.1111940580@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Jos van Roosmalen <josr(at)josr(dot)org> writes:
> Can someone explain me how to repair a table?

> If I execute a 'select * from <brokentable>' in PSQL I get the next
> error. A pg_dump export only partially export the broken table and
> ends with the same error:

> "Invalid memory alloc request size: 4294967290"

What you've probably got is a variable-width field whose length word
is corrupted. The best procedure is to identify and delete the broken
row (or rows; it's likely that the corruption hit more than just one
byte). You can find more details about this in the archives but the
short answer is that you do this by trial-and-error. I'd do something
like:

1. Determine how far into the table the bad value is, by finding the
smallest N such that
select * from tab offset N limit 1
fails. This is a bit tedious but if you understand the principle of
binary search (a/k/a divide and conquer) it won't take long.

2. Find a way to name that row, and delete it. ctid always works:
select ctid from tab offset N limit 1
delete from tab where ctid = 'that value'
You might also select as much as you can out of the row before you
delete it, to find out what it is you are losing. Usually, you'll
be able to select all the columns to the left of the corrupted one.
select a,b,c from tab where ctid = 'that value'

3. Can you pg_dump the table yet? If not, return to step 1.

This assumes that you just want to get rid of the bad data ASAP
and not try to do any forensics to discover what happened. If you
want to study the corruption in detail you could use pg_filedump
on the damaged page. The first component of ctid is the page number.

> Is there a Postgresql Repair application which delete or repair
> corrupt records?

I've not seen anything that really looks at the data contents of rows.

regards, tom lane