Re: Disk corruption detection

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Florian Weimer <fw(at)deneb(dot)enyo(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Disk corruption detection
Date: 2006-06-12 15:53:46
Message-ID: 20060612155346.GX34196@pervasive.com
Lists: pgsql-general

On Sun, Jun 11, 2006 at 07:42:55PM +0200, Florian Weimer wrote:
> We recently had a partially failed disk in a RAID-1 configuration
> which did not perform a write operation as requested. Consequently,
> the mirrored disks had different contents, and the file which
> contained the block switched randomly between two copies, depending on
> which disk had been read. (In theory, it is possible to always read
> from both disks, but this is not what RAID-1 configurations normally
> do.)

Actually, every RAID1 I've ever used reads from both disks to try to
balance the load.
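
To make the symptom concrete, here is a toy sketch of a mirror that
balances reads across both disks while one disk silently drops a write.
The ToyMirror class, its round-robin policy, and the drop_writes_on
field are all made up for illustration; real controllers use their own
balancing logic.

```python
class ToyMirror:
    """Hypothetical two-disk RAID-1 model, not any real controller's logic."""

    def __init__(self, nblocks, block_size=8192):
        # Two independent copies of every block, initially all zeroes.
        self.disks = [[bytes(block_size)] * nblocks for _ in range(2)]
        self.next_disk = 0          # round-robin read pointer (assumed policy)
        self.drop_writes_on = None  # index of a disk that silently loses writes

    def write(self, blockno, data):
        for i, disk in enumerate(self.disks):
            if i == self.drop_writes_on:
                continue  # simulate the partially failed disk ignoring the write
            disk[blockno] = data

    def read(self, blockno):
        # Alternate between the two disks on every read.
        disk = self.disks[self.next_disk]
        self.next_disk = (self.next_disk + 1) % 2
        return disk[blockno]


mirror = ToyMirror(nblocks=4)
mirror.drop_writes_on = 1                 # disk 1 misses the next write
mirror.write(0, b"new" + bytes(8189))
reads = [mirror.read(0)[:3] for _ in range(4)]
print(reads)  # the block appears to flip between new and old contents
```

With balanced reads, the same block alternates between the new and the
stale contents, which matches the "switched randomly between two copies"
behaviour described above.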

> Anyway, what are the chances that PostgreSQL would detect such
> corruption in a heap or index data file? It's typically hard to
> detect this at the application level, so I don't expect wonders. I'm
> just curious if using PostgreSQL would have helped to catch this
> sooner.

I know that WAL records are (or at least were) CRC'd, because there was
extensive discussion around 32-bit vs 64-bit CRCs. There is no such
check for data pages, although PostgreSQL has other ways to detect
errors (for example, sanity checks on the page header when a page is
read). But in a nutshell, if you care about your data, buy hardware you
can trust.
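
For illustration only, here is a minimal sketch of what a per-page CRC
would and would not catch. The page layout (a 4-byte CRC-32 prefix), the
8192-byte PAGE_SIZE, and the helper names seal_page and page_is_valid
are assumptions made for this example; this is not PostgreSQL's on-disk
format or its WAL CRC code.

```python
import struct
import zlib

PAGE_SIZE = 8192  # assumed page size for this sketch


def seal_page(payload: bytes) -> bytes:
    """Pad payload to a full page and prefix it with its CRC-32."""
    if len(payload) > PAGE_SIZE - 4:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    return struct.pack("<I", zlib.crc32(body)) + body


def page_is_valid(page: bytes) -> bool:
    """Recompute the CRC-32 over the body and compare with the stored one."""
    (stored,) = struct.unpack("<I", page[:4])
    return stored == zlib.crc32(page[4:])


old = seal_page(b"row version 1")   # what the failed disk still holds
new = seal_page(b"row version 2")   # what the healthy disk holds

damaged = bytearray(new)
damaged[100] ^= 0x01                # flip a single bit inside the page

print(page_is_valid(new))             # intact page
print(page_is_valid(bytes(damaged)))  # bit flip is caught
print(page_is_valid(old))             # stale page still checks out
```

Note the last case: a write that never reached one disk leaves behind an
old page that is still internally consistent, so a checksum stored in
the page itself would not flag it. It would only help with damage inside
a page, such as a flipped bit or a torn write.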
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
