Re: Enabling Checksums

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-03-07 03:27:53
Message-ID: 51380939.7030901@2ndQuadrant.com
Lists: pgsql-hackers

On 3/6/13 6:34 AM, Heikki Linnakangas wrote:
> Another thought is that perhaps something like CRC32C would be faster to
> calculate on modern hardware, and could be safely truncated to 16-bits
> using the same technique you're using to truncate the Fletcher's
> Checksum. Greg's tests showed that the overhead of CRC calculation is
> significant in some workloads, so it would be good to spend some time to
> optimize that. It'd be difficult to change the algorithm in a future
> release without breaking on-disk compatibility, so let's make sure we
> pick the best one.

Simon's first rev of this uses a quick-to-compute 16-bit checksum as a
reasonable trade-off, one that's possible to do right now. It's not
optimal in a few ways, but it catches single-bit errors that go
undetected today, and Fletcher-16 computes quickly without a large
amount of code. It's worth double-checking that the code is using the
best Fletcher-16 approach available. I've started on that, but I'm
working on your general performance concerns first, using the
implementation that's already there.
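
For reference, a textbook Fletcher-16 can be sketched as below. This is the
generic byte-wise form with both running sums taken modulo 255, not
necessarily the variant in Simon's patch. The `flip_bit` helper and the
64-byte stand-in block are assumptions made purely for illustration. The
demo flips every bit of the block in turn and counts how many of those
corruptions leave the checksum unchanged.

```python
def fletcher16(data: bytes) -> int:
    """Textbook Fletcher-16: two running sums, each reduced modulo 255."""
    sum1 = 0  # plain sum of the bytes
    sum2 = 0  # sum of the intermediate sum1 values, which adds position sensitivity
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Hypothetical helper: return a copy of data with one bit inverted."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


if __name__ == "__main__":
    # Widely published test vector for the modulo-255 form.
    assert fletcher16(b"abcde") == 0xC8F0

    # Small stand-in for a page; a real page would be much larger.
    block = bytes((i * 37 + 11) % 256 for i in range(64))
    good = fletcher16(block)

    missed = sum(
        1
        for bit in range(len(block) * 8)
        if fletcher16(flip_bit(block, bit)) == good
    )
    print(f"single-bit flips tried: {len(block) * 8}, undetected: {missed}")
```

A single flipped bit changes one byte by a power of two between 1 and 128,
none of which is a multiple of 255, so the first sum always changes and
the corruption is detected. The whole thing needs no lookup table, which is
the "without a large amount of code" part of the trade-off.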

From what I've read so far, I think picking Fletcher-16 instead of the
main alternative, CRC-16-IBM AKA CRC-16-ANSI, is a reasonable choice.
There's a good table showing the main possibilities here at
https://en.wikipedia.org/wiki/Cyclic_redundancy_check

One day I hope that in-place upgrade learns how to do page format
upgrades, with the sort of background conversion tools and necessary
tracking metadata we've discussed for that work. When that day comes, I
would expect it to be straightforward to upgrade pages from 16-bit
Fletcher checksums to 32-bit CRC-32C ones. Ideally we would be able to
jump on the CRC-32C train today, but there's nowhere in the current page
header to put all 32 bits. Using a 16-bit Fletcher checksum for 9.3
doesn't prevent the project
from going that way later though, once page header expansion is a solved
problem.

The problem with running CRC-32C in software is that the standard fast
approach uses a "slicing" technique that requires keeping a moderately
large lookup table of pre-computed values around. I don't see any
advantage to carrying all that baggage if you're just going to throw
away half of the result anyway. More on CRC-32C in my next message.
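
To make the size of that baggage concrete, here is a rough sketch of a
table-driven CRC-32C. It assumes the slicing-by-8 variant, which is one
common form of the technique; the message above does not say which variant
is meant. The function names are hypothetical, and Python is used only to
show the structure, not the speed.

```python
POLY = 0x82F63B78  # CRC-32C (Castagnoli) polynomial, bit-reflected form


def make_tables(n_slices: int = 8) -> list:
    """Build n_slices lookup tables of 256 32-bit entries each."""
    tables = [[0] * 256 for _ in range(n_slices)]
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        tables[0][i] = crc
    # Each further table advances the previous one by one more zero byte.
    for k in range(1, n_slices):
        for i in range(256):
            prev = tables[k - 1][i]
            tables[k][i] = (prev >> 8) ^ tables[0][prev & 0xFF]
    return tables


def crc32c_bytewise(data: bytes, tables: list) -> int:
    """One table lookup per input byte; needs only tables[0]."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ tables[0][(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


def crc32c_slice8(data: bytes, tables: list) -> int:
    """Consume 8 bytes per step using all eight tables."""
    crc = 0xFFFFFFFF
    whole = len(data) - len(data) % 8
    for off in range(0, whole, 8):
        one = int.from_bytes(data[off:off + 4], "little") ^ crc
        two = int.from_bytes(data[off + 4:off + 8], "little")
        crc = (tables[7][one & 0xFF] ^ tables[6][(one >> 8) & 0xFF]
               ^ tables[5][(one >> 16) & 0xFF] ^ tables[4][one >> 24]
               ^ tables[3][two & 0xFF] ^ tables[2][(two >> 8) & 0xFF]
               ^ tables[1][(two >> 16) & 0xFF] ^ tables[0][two >> 24])
    for byte in data[whole:]:  # leftover tail, one byte at a time
        crc = (crc >> 8) ^ tables[0][(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


TABLES = make_tables()

if __name__ == "__main__":
    # Standard CRC-32C check value for the ASCII string "123456789".
    assert crc32c_slice8(b"123456789", TABLES) == 0xE3069283
    table_bytes = len(TABLES) * 256 * 4  # 32-bit entries
    print(f"lookup tables: {table_bytes} bytes")
```

With eight tables of 256 four-byte entries, the pre-computed data comes to
8 KB, compared with no table at all for Fletcher-16.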

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
