Re: Enabling Checksums

From: Jim Nasby <jim(at)nasby(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Daniel Farina <daniel(at)heroku(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-03-13 22:24:54
Message-ID: 5140FCB6.5020709@nasby.net
Lists: pgsql-hackers

On 3/7/13 9:31 PM, Bruce Momjian wrote:
> 1 storage
> 2 storage controller
> 3 file system
> 4 RAM
> 5 CPU

I would add a layer 2.5 in there: the storage interconnect (iSCSI, FC, what-have-you). Obviously not everyone has one.

> My guess is that storage checksums only cover layer 1, while our patch
> covers layers 1-3, and probably not 4-5 because we only compute the
> checksum on write.

Actually, it depends. In our case, we run 512GB servers with 8GB of shared buffers (previous testing has shown that anything much bigger than 8GB hurts performance).

So in our case, PG checksums protect a very significant portion of #4.
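A rough back-of-the-envelope sketch of that claim, using the two figures above. It assumes that a page's checksum is computed when the page leaves shared buffers and is verified when the page is read back in, so a page sitting in the OS cache is covered while a page sitting in shared buffers is not. The function name is made up for illustration, and the result is only an upper bound, since not all RAM outside shared buffers holds cached database pages.

```python
# Figures from the message: 512GB of RAM, 8GB of shared buffers.
TOTAL_RAM_GB = 512
SHARED_BUFFERS_GB = 8


def ram_fraction_outside_shared_buffers(total_gb, shared_buffers_gb):
    """Share of RAM that is not shared buffers (hypothetical helper).

    Assumption: pages cached by the OS already carry a checksum that is
    verified when they are read back into shared buffers, so this is an
    upper bound on the part of RAM the checksums can cover.
    """
    return (total_gb - shared_buffers_gb) / total_gb


fraction = ram_fraction_outside_shared_buffers(TOTAL_RAM_GB, SHARED_BUFFERS_GB)
print(f"{fraction:.1%} of RAM lies outside shared buffers")
```

With these numbers only 8 of 512 GB is shared buffers, which is why the ratio between RAM and shared_buffers decides how much of layer 4 is covered.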

> If that is correct, the open question is what percentage of corruption
> happens in layers 1-3?

The last bout of corruption we had was entirely coincident with memory failures. IIRC we had 3-4 corruption events on more than one server. Everything was running standard ECC (sadly, not 4-bit ECC).
