Re: Enabling Checksums

From: Ants Aasma <ants(at)cybertec(dot)at>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-04-18 18:50:25
Message-ID: CA+CSw_tH9d0WPz-JtUwa+zfOn-imtS_GgPxMC=VoHyPu9hFThA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 18, 2013 at 9:11 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On 18.04.2013, at 20:02, Ants Aasma <ants(at)cybertec(dot)at> wrote:
>> On Thu, Apr 18, 2013 at 8:24 PM, Ants Aasma <ants(at)cybertec(dot)at> wrote:
>>> On Thu, Apr 18, 2013 at 8:15 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>>>> So either the CRC32-C polynomial isn't irreducible, or there's something
>>>> fishy going on. Could there be a bug in your CRC implementation? Maybe
>>>> a mixup between big and little endian, or something like that?
>>>
>>> I'm suspecting an implementation bug myself. I already checked the
>>> test harness and that was all sane, compiler hadn't taken any
>>> unforgivable liberties there. I will crosscheck the output with other
>>> implementations to verify that the checksum is implemented correctly.
>>
>> Looks like the implementation is correct. I cross-referenced it
>> against a bitwise algorithm for crc32 with the Castagnoli polynomial.
>> This also rules out any endianness issues as the bitwise variant
>> consumes input a byte at a time.
>>
>> Whatever it is, it is something specific to PostgreSQL page layout.
>> If I use /dev/urandom as the source the issue disappears. So much for
>> CRC32 being proven good.
>
> Weird. Is the code of your test harness available publicly, or could you post it? I'd like to look into this...

Mystery solved. It was a bug in the test harness. If a page was
partially zero, the cut-point wasn't correctly excluded from the
all-zero suffix, so when overwriting the zero suffix (correctly) gave
a checksum match, it was counted as a false positive. It didn't pop up
on other algorithms because for those I used a lot more data, so the
partial page false positives were drowned out. With this fixed, all
algorithms give reasonably good detection rates for partial writes.
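
To illustrate the class of bug, here is a minimal sketch; it is not
the attached harness. It assumes a partial write can be modelled as
zero-filling the page from the cut-point onward, that the checksum is
kept outside the page, and that pages are 8192 bytes. The names
crc32c_bitwise, nonzero_prefix_len and count_false_positives are
hypothetical. The point is the loop bound: cut-points inside the
all-zero suffix leave the page unchanged, so a match there must not be
counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192 /* assumption: default PostgreSQL block size */

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
 * Consumes input one byte at a time, so host endianness is irrelevant. */
static uint32_t crc32c_bitwise(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Number of bytes before the trailing run of zero bytes. */
static size_t nonzero_prefix_len(const uint8_t *page)
{
    size_t n = PAGE_SIZE;

    while (n > 0 && page[n - 1] == 0)
        n--;
    return n;
}

/* Count cut-points where a zero-filled tail goes undetected.
 * Only cut-points before the all-zero suffix are tried; at those the
 * torn page always differs from the original, so any checksum match
 * is a genuine false positive. */
static size_t count_false_positives(const uint8_t *page, size_t step)
{
    uint8_t torn[PAGE_SIZE];
    uint32_t good = crc32c_bitwise(page, PAGE_SIZE);
    size_t limit = nonzero_prefix_len(page);
    size_t misses = 0;

    assert(step > 0);
    for (size_t cut = 0; cut < limit; cut += step) {
        memcpy(torn, page, cut);                /* bytes that were written */
        memset(torn + cut, 0, PAGE_SIZE - cut); /* tail that was lost */
        if (crc32c_bitwise(torn, PAGE_SIZE) == good)
            misses++;
    }
    return misses;
}
```

If the loop ran up to PAGE_SIZE instead of limit, every cut-point in
the zero suffix would produce an identical page and a guaranteed
"miss", which is the kind of inflated count described above.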

The (now correct) test suite is attached. Compile check-detection.c;
the other files are included from there. See the defines above the
main function for parameters. Please excuse the code being a
hodgepodge of thrown-together snippets. For test data I used all files
from a fresh pg-9.3 database loaded with the IMDB dataset, including
vm and fsm pages.

Sorry about the false alarm.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

Attachment Content-Type Size
checksums-fnv.c text/x-csrc 5.5 KB
8x256_tables.c text/x-csrc 30.0 KB
checksums-crc32c.c text/x-csrc 1.7 KB
check-detection.c text/x-csrc 6.5 KB
