Re: Database corruption?

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: alvherre(at)atentus(dot)com, pgsql-general(at)postgresql(dot)org, vmikheev(at)SECTORBASE(dot)COM
Subject: Re: Database corruption?
Date: 2001-10-31 00:58:04
Message-ID: 20011031095804H.t-ishii@sra.co.jp
Lists: pgsql-general

> It may be unthinkable hubris to say this, but ... I am starting to
> notice that a larger and larger fraction of serious trouble reports
> ultimately trace to hardware failures, not software bugs. Seems we've
> done a good job getting data-corruption bugs out of Postgres.
>
> Perhaps we should reconsider the notion of keeping CRC checksums on
> data pages. Not sure what we could do to defend against bad RAM,
> however.

Good idea.

I have been troubled by a really strange problem. Populating a database
with a huge amount of data (~7GB) causes random failures: for example, a
mysterious unique constraint violation, count(*) returning an incorrect
number, or pg_temp* suddenly disappearing (the table in question is a
temporary table). These failures are really hard to reproduce and happen
on 7.0 through current, virtually every PostgreSQL release. Even on an
identical system, the problems sometimes go away after a re-initdb...

I now suspect that a hardware failure might be the source of the
trouble. The problem is that, so far, I see no sign of one in the
standard system logs, such as syslog or messages.

It would be really nice if PostgreSQL could be protected from such
hardware failures using CRC or whatever...
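
To make the idea concrete, here is a minimal sketch of a per-page CRC.
It is not PostgreSQL code: the 8192-byte page size, the trailing 4-byte
CRC slot, and the names seal_page and page_is_intact are all assumptions
made up for this illustration. The writer stores a CRC-32 of the page
body, and the reader recomputes it to notice damage that happened
between the write and the read.

```python
import struct
import zlib

PAGE_SIZE = 8192   # assumed page size for this sketch
CRC_SIZE = 4       # assumed layout: CRC-32 kept in the last 4 bytes


def seal_page(payload: bytes) -> bytes:
    """Pad the payload to a full page and append a CRC-32 of the body."""
    if len(payload) > PAGE_SIZE - CRC_SIZE:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAGE_SIZE - CRC_SIZE, b"\x00")
    return body + struct.pack("<I", zlib.crc32(body))


def page_is_intact(page: bytes) -> bool:
    """Recompute the CRC-32 of the body and compare it with the stored one."""
    if len(page) != PAGE_SIZE:
        return False
    body = page[:-CRC_SIZE]
    (stored,) = struct.unpack("<I", page[-CRC_SIZE:])
    return zlib.crc32(body) == stored


if __name__ == "__main__":
    page = seal_page(b"some tuple data")
    print(page_is_intact(page))          # freshly sealed page

    damaged = bytearray(page)
    damaged[100] ^= 0x01                 # simulate a single flipped bit
    print(page_is_intact(bytes(damaged)))
```

A check like this can only catch corruption that occurs after the CRC
was computed (on disk or in transit). If bad RAM damages the page before
it is sealed, the CRC will faithfully cover the bad data, which is the
limitation mentioned in the quoted message.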
--
Tatsuo Ishii
