Re: Database corruption?

From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Database corruption?
Date: 2001-10-22 05:13:43
Message-ID: Pine.LNX.4.33L2.0110220136050.13022-100000@aguila.protecne.cl
Lists: pgsql-general

Hello:

I didn't think I would live to see a corrupted database, but I now have.
I don't know how it happened; it looks like some table got corrupted
during VACUUM and now the database won't start. The serverlog shows:

invoking IpcMemoryCreate(size=3203072)
FindExec: found "/usr/local/pgsql/bin/postmaster" using argv[0]
DEBUG: database system shutdown was interrupted at 2001-10-22 01:03:37 CLST
DEBUG: CheckPoint record at (13, 3399750448)
DEBUG: Redo record at (13, 3399750448); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 20960076; NextOid: 41447617
DEBUG: database system was not properly shut down; automatic recovery in progress...
DEBUG: redo starts at (13, 3399750512)
REDO @ 13/3399750512; LSN 13/3399750548: prev 13/3399750448; xprev 0/0; xid 20960086: XLOG - nextOid: 41455809
REDO @ 13/3399750548; LSN 13/3399758820: prev 13/3399750512; xprev 0/0; xid 20960086; bkpb 1: Heap - insert: node 16283895/16287107; tid 333/97
REDO @ 13/3399758820; LSN 13/3399767092: prev 13/3399750548; xprev 13/3399750548; xid 20960086; bkpb 1: Btree - insert: node 16283895/23651833; tid 195/257
REDO @ 13/3399767092; LSN 13/3399767164: prev 13/3399758820; xprev 13/3399758820; xid 20960086: Heap - insert: node 16283895/16287107; tid 333/98
REDO @ 13/3399767164; LSN 13/3399767228: prev 13/3399767092; xprev 13/3399767092; xid 20960086: Btree - insert: node 16283895/23651833; tid 195/258
REDO @ 13/3399767228; LSN 13/3399767300: prev 13/3399767164; xprev 13/3399767164; xid 20960086: Heap - insert: node 16283895/16287107; tid 333/99
REDO @ 13/3399767300; LSN 13/3399767364: prev 13/3399767228; xprev 13/3399767228; xid 20960086: Btree - insert: node 16283895/23651833; tid 195/259
REDO @ 13/3399767364; LSN 13/3399767448: prev 13/3399767300; xprev 13/3399767300; xid 20960086: Heap - update: node 16283895/16287620; tid 428/79; new 428/129
REDO @ 13/3399767448; LSN 13/3399775720: prev 13/3399767364; xprev 13/3399767364; xid 20960086; bkpb 1: Btree - insert: node 16283895/23651923; tid 4/2
REDO @ 13/3399775720; LSN 13/3399775780: prev 13/3399767448; xprev 13/3399767448; xid 20960086: Btree - insert: node 16283895/23651926; tid 233/6
DEBUG: ReadRecord: record with zero len at (13, 3399775780)
DEBUG: redo done at (13, 3399775720)
XLogFlush: rqst 13/3399767300; wrt 13/3399775780; flsh 13/3399775780
XLogFlush: rqst 13/3399767364; wrt 13/3399775780; flsh 13/3399775780
XLogFlush: rqst 13/3400103600; wrt 13/3399775780; flsh 13/3399775780
FATAL 2: XLogFlush: request is not satisfied
DEBUG: proc_exit(2)
DEBUG: shmem_exit(2)
DEBUG: exit(2)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: Startup proc 3855 exited with status 512 - abort
/usr/local/pgsql/bin/postmaster: PostmasterMain: initial environ dump:
[blah]

And here is a backtrace taken from a core file I found lying around,
whose timestamp makes me think it has something to say:

(gdb) bt
#0 0x4018cbf4 in memmove () from /lib/libc.so.6
#1 0x08100f85 in PageRepairFragmentation ()
#2 0x080ae9a7 in scan_heap ()
#3 0x080adfb4 in vacuum_rel ()
#4 0x080adbee in vac_vacuum ()
#5 0x080adb68 in vacuum ()
#6 0x08105c72 in ProcessUtility ()
#7 0x081039d9 in pg_exec_query_string ()
#8 0x08104adb in PostgresMain ()
#9 0x080ee7e4 in DoBackend ()
#10 0x080ee3c5 in BackendStartup ()
#11 0x080ed599 in ServerLoop ()
#12 0x080ecfa6 in PostmasterMain ()
#13 0x080ccb8f in main ()
#14 0x401231f0 in __libc_start_main () from /lib/libc.so.6

The database has been running for months without trouble. I'm now trying
desperate measures, but I fear I will have to restore from backup (a week
old). I have taken a tarball of the complete location (pg_xlog included and
all that stuff) if anyone wants to see it (but it's 2 GB).

I think I know which table is dead, but I don't know what to do with that
information :-( In the serverlog, I see

DEBUG: --Relation delay_171--
NOTICE: Rel delay_171: TID 15502/4279: OID IS INVALID. TUPGONE 0.
NOTICE: Rel delay_171: TID 15502/4291: OID IS INVALID. TUPGONE 1.
NOTICE: Rel delay_171: TID 15502/4315: OID IS INVALID. TUPGONE 1.
NOTICE: Rel delay_171: TID 15502/4375: OID IS INVALID. TUPGONE 0.
NOTICE: Rel delay_171: TID 15502/4723: OID IS INVALID. TUPGONE 1.
NOTICE: Rel delay_171: TID 15502/4771: OID IS INVALID. TUPGONE 0.
NOTICE: Rel delay_171: TID 15502/4783: OID IS INVALID. TUPGONE 0.
NOTICE: Rel delay_171: TID 15502/4831: OID IS INVALID. TUPGONE 1.
NOTICE: Rel delay_171: TID 15502/4843: OID IS INVALID. TUPGONE 0.
NOTICE: Rel delay_171: TID 15502/4867: InsertTransactionInProgress 0 - can't shrink relation
NOTICE: Rel delay_171: TID 15502/4867: OID IS INVALID. TUPGONE 0.
[many similar-looking lines]
NOTICE: Rel delay_171: TID 15502/6067: OID IS INVALID. TUPGONE 0.
Server process (pid 22773) exited with status 139 at Sun Oct 21 02:30:27 2001
Terminating any active server processes...
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.

(this is way before the lines from the serverlog I showed earlier)

I really don't know what to do from here.

--
Alvaro Herrera (<alvherre[(at)]atentus(dot)com>)
"Rebellion is the original virtue of man" (Arthur Schopenhauer)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)atentus(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, Vadim Mikheev <vmikheev(at)sectorbase(dot)com>
Subject: Re: Database corruption?
Date: 2001-10-23 00:14:49
Message-ID: 28353.1003796089@sss.pgh.pa.us
Lists: pgsql-general

Alvaro Herrera <alvherre(at)atentus(dot)com> writes:
> FATAL 2: XLogFlush: request is not satisfied

We had a previous report of this same failure message --- see
the thread starting at
http://fts.postgresql.org/db/mw/msg.html?mid=1033586

> And here is a backtrace taken from a core file I found lying around,
> whose timestamp makes me think it has something to say:

> (gdb) bt
> #0 0x4018cbf4 in memmove () from /lib/libc.so.6
> #1 0x08100f85 in PageRepairFragmentation ()
> #2 0x080ae9a7 in scan_heap ()
> #3 0x080adfb4 in vacuum_rel ()
> #4 0x080adbee in vac_vacuum ()
> #5 0x080adb68 in vacuum ()

It would be useful to look into that too, for sure, but I think it is
probably not related to your XLog problem.

> The database has been running for months without trouble. I'm now trying
> desperate measures, but I fear I will have to restore from backup (a week
> old). I have taken a tarball of the complete location (pg_xlog included and
> all that stuff) if anyone wants to see it (but it's 2 GB).

As I said to Denis in the earlier thread, it would be good to try to
track down which page is corrupted and maybe then we'd understand how
it got that way. Since you have the database tarball, you have the
raw material to look into it --- you'd need to rebuild Postgres with
debug symbols enabled and trace back from the failure points to learn
more. Are you up to that, or could you grant access to your machine to
someone who is?

As for your immediate problem, I'd counsel reducing that elog(STOP) to
elog(DEBUG) so that you can bring the database up, and then you can
try to pg_dump your current data. You'll probably still want to
re-initdb and restore once you get a consistent dump.

Um, Vadim? Still of the opinion that elog(STOP) is a good idea here?
That's two people now for whom that decision has turned localized
corruption into complete database failure. I don't think it's a good
tradeoff.

regards, tom lane


From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>, Vadim Mikheev <vmikheev(at)SECTORBASE(dot)COM>
Subject: Re: Database corruption?
Date: 2001-10-23 03:15:30
Message-ID: Pine.LNX.4.33L2.0110230009030.11257-100000@aguila.protecne.cl
Lists: pgsql-general

On Mon, 22 Oct 2001, Tom Lane wrote:

> Alvaro Herrera <alvherre(at)atentus(dot)com> writes:
> > FATAL 2: XLogFlush: request is not satisfied
>
> We had a previous report of this same failure message --- see
> the thread starting at
> http://fts.postgresql.org/db/mw/msg.html?mid=1033586

All right, I'll give it a try next week.

> Since you have the database tarball, you have the
> raw material to look into it --- you'd need to rebuild Postgres with
> debug symbols enabled and trace back from the failure points to learn
> more. Are you up to that, or could you grant access to your machine to
> someone who is?

I am. I'll rebuild and then maybe you'll see what gives.

> Um, Vadim? Still of the opinion that elog(STOP) is a good idea here?
> That's two people now for whom that decision has turned localized
> corruption into complete database failure. I don't think it's a good
> tradeoff.

Well, if there's this reset-xlog business, I don't think that'd be
necessary: just point it out. Had I known... but now the problem is
solved.

--
Alvaro Herrera (<alvherre[(at)]atentus(dot)com>)
"When there is no humility, people degrade themselves" (A. Christie)


From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>, Vadim Mikheev <vmikheev(at)SECTORBASE(dot)COM>
Subject: Re: Database corruption?
Date: 2001-10-30 15:51:15
Message-ID: Pine.LNX.4.33L2.0110301248280.409-100000@aguila.protecne.cl
Lists: pgsql-general

On Mon, 22 Oct 2001, Tom Lane wrote:

> Alvaro Herrera <alvherre(at)atentus(dot)com> writes:

> > The database has been running for months without trouble. I'm now trying
> > desperate measures, but I fear I will have to restore from backup (a week
> > old). I have taken a tarball of the complete location (pg_xlog included and
> > all that stuff) if anyone wants to see it (but it's 2 GB).
>
> As I said to Denis in the earlier thread, it would be good to try to
> track down which page is corrupted and maybe then we'd understand how
> it got that way. Since you have the database tarball, you have the
> raw material to look into it --- you'd need to rebuild Postgres with
> debug symbols enabled and trace back from the failure points to learn
> more. Are you up to that, or could you grant access to your machine to
> someone who is?

The problem ended up being bad RAM (it was good two weeks ago). I don't
think it's of any use to do anything else with the database. Sorry for
the noise.

If you think it's useful anyway, I can recompile and see what's up. Do
you think so?

--
Alvaro Herrera (<alvherre[(at)]atentus(dot)com>)
"No one is more enslaved than he who falsely believes himself free" (Goethe)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)atentus(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, Vadim Mikheev <vmikheev(at)SECTORBASE(dot)COM>
Subject: Re: Database corruption?
Date: 2001-10-30 16:03:13
Message-ID: 20715.1004457793@sss.pgh.pa.us
Lists: pgsql-general

Alvaro Herrera <alvherre(at)atentus(dot)com> writes:
> The problem ended up being bad RAM (it was good two weeks ago). I don't
> think it's of any use to do anything else with the database.

Ah so. Thanks for following up.

It may be unthinkable hubris to say this, but ... I am starting to
notice that a larger and larger fraction of serious trouble reports
ultimately trace to hardware failures, not software bugs. Seems we've
done a good job getting data-corruption bugs out of Postgres.

Perhaps we should reconsider the notion of keeping CRC checksums on
data pages. Not sure what we could do to defend against bad RAM,
however.

regards, tom lane


From: "Mitch Vincent" <mvincent(at)cablespeed(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database corruption?
Date: 2001-10-30 16:48:33
Message-ID: 004201c16162$b67b2ea0$1e51000a@mitch
Lists: pgsql-general

> It may be unthinkable hubris to say this, but ... I am starting to
> notice that a larger and larger fraction of serious trouble reports
> ultimately trace to hardware failures, not software bugs. Seems we've
> done a good job getting data-corruption bugs out of Postgres.

I can say from personal experience that since 7.0 I haven't had a single
instance of corruption... I had quite a few troubles with pre-7.0 versions,
but you guys have kicked some serious butt in getting those bugs out of PG.

As always, excellent work gentlemen..

-Mitch


From: Jon Lapham <lapham(at)extracta(dot)com(dot)br>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Database corruption?
Date: 2001-10-30 17:27:50
Message-ID: 3BDEE316.7010100@extracta.com.br
Lists: pgsql-general

Tom Lane wrote:

> Alvaro Herrera <alvherre(at)atentus(dot)com> writes:
>
>>The problem ended up being bad RAM (it was good two weeks ago). I don't
>>think it's of any use to do anything else with the database.
>>
>
> Ah so. Thanks for following up.
>
> It may be unthinkable hubris to say this, but ... I am starting to
> notice that a larger and larger fraction of serious trouble reports
> ultimately trace to hardware failures, not software bugs. Seems we've
> done a good job getting data-corruption bugs out of Postgres.
>

I am certainly a member of the "hardware causes db corruption" family, as
it has happened to me more than once. RAM and power supplies seem to be
the usual suspects.

Has anyone else noticed that only a few years ago it seemed like
hardware was more robust? Or is that just a fanciful projection of the
"good ol' days"?

--

-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham
Extracta Moléculas Naturais, Rio de Janeiro, Brasil
email: lapham(at)extracta(dot)com(dot)br web: http://www.extracta.com.br/
***-*--*----*-------*------------*--------------------*---------------


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: alvherre(at)atentus(dot)com, pgsql-general(at)postgresql(dot)org, vmikheev(at)SECTORBASE(dot)COM
Subject: Re: Database corruption?
Date: 2001-10-31 00:58:04
Message-ID: 20011031095804H.t-ishii@sra.co.jp
Lists: pgsql-general

> It may be unthinkable hubris to say this, but ... I am starting to
> notice that a larger and larger fraction of serious trouble reports
> ultimately trace to hardware failures, not software bugs. Seems we've
> done a good job getting data-corruption bugs out of Postgres.
>
> Perhaps we should reconsider the notion of keeping CRC checksums on
> data pages. Not sure what we could do to defend against bad RAM,
> however.

Good idea.

I have been troubled by a really strange problem. Populating with huge
data (~7GB) causes random failures: for example, a mysterious unique
constraint violation, count(*) showing an incorrect number, or pg_temp*
suddenly disappearing (the table in question is a temporary table). These
are really hard to reproduce and happen on 7.0 through current --
virtually any PostgreSQL release. Even on an identical system, the
problems sometimes go away after re-initdb...

I now suspect that some hardware failure might be the source of the
trouble. The problem is, I see no sign of it so far in the standard
system logs, such as syslog or messages.

It would be really nice if PostgreSQL could be protected from such
hardware failures using CRC or whatever...
--
Tatsuo Ishii


From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-general(at)postgresql(dot)org>, <vmikheev(at)SECTORBASE(dot)COM>
Subject: Re: Database corruption?
Date: 2001-10-31 01:20:31
Message-ID: Pine.LNX.4.33L2.0110302216150.16301-100000@aguila.protecne.cl
Lists: pgsql-general

On Wed, 31 Oct 2001, Tatsuo Ishii wrote:

> > It may be unthinkable hubris to say this, but ... I am starting to
> > notice that a larger and larger fraction of serious trouble reports
> > ultimately trace to hardware failures, not software bugs. Seems we've
> > done a good job getting data-corruption bugs out of Postgres.
> >
> > Perhaps we should reconsider the notion of keeping CRC checksums on
> > data pages. Not sure what we could do to defend against bad RAM,
> > however.

Maybe we can't defend against it, but at least we can detect it and warn
the user that something is likely to go wrong.

> I have been troubled by a really strange problem. Populating with huge
> data (~7GB) causes random failures: for example, a mysterious unique
> constraint violation, count(*) showing an incorrect number, or pg_temp*
> suddenly disappearing (the table in question is a temporary table).

Remember the guy who had to change relnatts by hand to get a table back
online? It was bad RAM. One may wonder just how big a coincidence it was
that exactly that bit got changed... Well, a CRC checksum mismatch
would've warned him right away.

--
Alvaro Herrera (<alvherre[(at)]atentus(dot)com>)
"If you want to be creative, learn the art of wasting time"


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)atentus(dot)com, pgsql-general(at)postgresql(dot)org, vmikheev(at)SECTORBASE(dot)COM
Subject: Re: Database corruption?
Date: 2001-10-31 01:30:28
Message-ID: 200110310130.f9V1USC02306@candle.pha.pa.us
Lists: pgsql-general

> I have been troubled by a really strange problem. Populating with huge
> data (~7GB) causes random failures: for example, a mysterious unique
> constraint violation, count(*) showing an incorrect number, or pg_temp*
> suddenly disappearing (the table in question is a temporary table). These
> are really hard to reproduce and happen on 7.0 through current --
> virtually any PostgreSQL release. Even on an identical system, the
> problems sometimes go away after re-initdb...
>
> I now suspect that some hardware failures might be the source of the
> trouble. Problem is, I see no sign so far from the standard system
> logs, such as syslog or messages.
>
> It would be really nice if PostgreSQL could be protected from such
> hardware failures using CRC or whatever...

At a minimum, for cases where hardware problems are suspected, we should
have some CRC detection code we can turn on.

Tatsuo, does --enable-cassert help?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: pgman(at)candle(dot)pha(dot)pa(dot)us
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)atentus(dot)com, pgsql-general(at)postgresql(dot)org, vmikheev(at)SECTORBASE(dot)COM
Subject: Re: Database corruption?
Date: 2001-10-31 01:34:07
Message-ID: 20011031103407N.t-ishii@sra.co.jp
Lists: pgsql-general

> At a minimum, for cases where hardware problems are suspected, we should
> have some CRC detection code we can turn on.
>
> Tatsuo, does --enable-cassert help?

Not yet. Instead we are running Purify to see if it detects
something going wrong. It will take a long time...
--
Tatsuo Ishii


From: "Dr(dot) Evil" <drevil(at)sidereal(dot)kz>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Database corruption?
Date: 2001-10-31 01:42:59
Message-ID: 20011031014259.30742.qmail@sidereal.kz
Lists: pgsql-general

> > Perhaps we should reconsider the notion of keeping CRC checksums on
> > data pages. Not sure what we could do to defend against bad RAM,
> > however.
>
> Good idea.

I third that. Machines are very fast today. Reliability is vastly
more important than speed in many cases. Let's say that CRC
introduces a 20% slowdown (which is probably an overstatement). This
means that, to get the same speed, we will have to spend 20% more on
the hardware. Let's say that this is a top-end server, so we have to
spend $12k instead of $10k. Is $2k worth it for better data
reliability? Absolutely! No question about it.


From: Andrew Snow <andrew(at)modulus(dot)org>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database corruption?
Date: 2001-10-31 01:47:51
Message-ID: 20011031124539.S68176-100000@esper.modulus.org
Lists: pgsql-general


Error-correcting RAM modules for PCs are becoming very cheap these days... I
think anyone doing serious work on a computer should be using them now
to avoid problems like this.

- Andrew


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Andrew Snow <andrew(at)modulus(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database corruption?
Date: 2001-10-31 02:18:42
Message-ID: 200110310218.f9V2IhM06805@candle.pha.pa.us
Lists: pgsql-general

>
> Error-correcting RAM modules for PCs are becoming very cheap these days... I
> think anyone doing serious work on a computer should be using them now
> to avoid problems like this.

I think any error-detection code we add would be optional and default to
off. Also, remember it is error detection, usually not error
correction.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026