Re: Enabling Checksums

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-03-15 13:08:45
Message-ID: 20130315130845.GC21834@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-03-15 14:32:57 +0200, Ants Aasma wrote:
> On Wed, Mar 6, 2013 at 1:34 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
> > Fletcher's checksum is good in general, I was mainly worried about
> > truncating the Fletcher-64 into two 8-bit values. I can't spot any obvious
> > weakness in it, but if it's indeed faster and as good as a straightforward
> > Fletcher-16, I wonder why that method is not more widely used.
>
> As implented, the fletcher algorithm as implemented results in:
>
> checksum low byte = (blkno + sum over i [0..N) (x_i)) % 255 + 1
> checksum high byte = (blkno + sum over i in [0..N) ((N - i)*x_i)) % 255 + 1
>
> Where N is the number of 4 bytes words in the page and x_i is the i-th
> word. As modular arithmetic is a ring, it is easy to show that any
> addition or subtraction of a multiple of 255 = 0xFF will result in no
> change to the resulting value. The most obvious case here is that you
> can swap any number of bytes from 0x00 to 0xFF or back without
> affecting the hash.

I commented on this before, I personally think this property makes fletcher a
not so good fit for this. Its not uncommon for parts of a block being all-zero
and many disk corruptions actually change whole runs of bytes.

We could try to mess with this by doing an unsigned addition for each byte we
checksum. Increment the first byte by 0, the second one by 1, ... and then wrap
around at 254 again. That would allow us to detect changes of multiple bytes
that swap from all-zero to all-ones or viceversa.

I think we should just try to use some polynom of CRC32 and try to get that
fast though.

Even without taking advantage of vectorization and such you can get a good,
good bit faster than our current
implementation. E.g. http://archives.postgresql.org/message-id/201005202227.49990.andres%40anarazel.de

I still think changing the polynom to Castagnoli makes sense... Both from a
performance and from an error detection perspective.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2013-03-15 14:19:14 Re: lock AccessShareLock on object 0/1260/0 is already held
Previous Message Ants Aasma 2013-03-15 12:34:51 Re: Enabling Checksums