Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CRC algorithm (was Re: [REVIEW] Re: Compression of full-page-writes)
Date: 2014-09-16 10:49:20
Message-ID: 541815B0.2050006@vmware.com
Lists: pgsql-hackers

On 09/16/2014 01:28 PM, Andres Freund wrote:
> On 2014-09-16 15:43:06 +0530, Amit Kapila wrote:
>> On Sat, Sep 13, 2014 at 1:33 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
>> wrote:
>>> On 09/12/2014 10:54 PM, Abhijit Menon-Sen wrote:
>>>> At 2014-09-12 22:38:01 +0300, hlinnakangas(at)vmware(dot)com wrote:
>>>>> We probably should consider switching to a faster CRC algorithm again,
>>>>> regardless of what we do with compression.
>>>>
>>>> As it happens, I'm already working on resurrecting a patch that Andres
>>>> posted in 2010 to switch to zlib's faster CRC implementation.
>>>
>>> As it happens, I also wrote an implementation of Slice-by-4 the other
>>> day :-). Haven't gotten around to post it, but here it is.
>>
>> If we use this implementation for everything that uses the
>> COMP_CRC32() macro, won't it cause problems for databases created
>> by older versions? I created a database with HEAD code and then
>> tried to start the server after applying this patch; it gives the
>> error below:
>> FATAL: incorrect checksum in control file
>
> That's indicative of a bug. This really shouldn't cause such problems -
> at least my version was compatible with the current definition, and IIRC
> Heikki's should be the same in theory. If I read it right.
>
>> In general, the idea sounds quite promising. To see how it performs
>> on small to medium size data, I have used the attached test, which was
>> written by you (with some additional tests) during performance testing
>> of the WAL reduction patch in 9.4.
>
> Yes, we should really do this.
>
>> The patched version gives better results in all cases
>> (in the range of 10~15%). This is not a perfect test, but
>> it gives a fair idea that the patch is quite promising. I think to test
>> the benefit from CRC calculation for full pages, we can have a
>> checkpoint during each test (maybe after the insert). Let me know
>> what other kinds of tests you think are required to see the
>> gain/loss from this patch.
>
> I actually think we don't really need this. It's pretty evident that
> slice-by-4 is a clear improvement.
>
>> I think the main difference between this patch and what Andres
>> developed some time back is the code for a manually unrolled loop
>> doing 32 bytes at once, so once Andres or Abhijit posts an
>> updated version, we can do some performance tests to see
>> if there is any additional gain.
>
> If Heikki's version works I see little need to use my/Abhijit's
> patch. That version has part of it under the zlib license. If Heikki's
> version is a 'clean room', then I'd say we go with it. It looks really
> quite similar though... We can make minor changes like additional
> unrolling without problems later on.

I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as
reference - you can probably see the similarity. Any implementation is
going to look more or less the same, though; there aren't that many ways
to write the implementation.
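
For readers following along, here is a minimal sketch of what a slicing-by-4 loop generally looks like. It is not the patch from this thread and not PostgreSQL's COMP_CRC32() code: it assumes the common reflected CRC-32 polynomial (0xEDB88320, as used by zlib), and all function and table names are made up for illustration. It also shows the compatibility point discussed above: a sliced loop is only an optimization, so it should return the same value as the plain byte-at-a-time loop over the same table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; 4 tables of 256 entries each for slicing-by-4. */
static uint32_t crc_table[4][256];

/* Build the tables. Must be called once before computing any CRC. */
static void crc32_init_tables(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[0][i] = c;
    }
    /* Table t advances table t-1 by one more zero byte. */
    for (uint32_t i = 0; i < 256; i++)
        for (int t = 1; t < 4; t++)
            crc_table[t][i] = (crc_table[t - 1][i] >> 8)
                ^ crc_table[0][crc_table[t - 1][i] & 0xFF];
}

/* Reference: classic one-byte-per-iteration table lookup. */
static uint32_t crc32_bytewise(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
        crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];
    return crc ^ 0xFFFFFFFFu;
}

/* Slicing-by-4: consume four input bytes per iteration. */
static uint32_t crc32_slice4(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len >= 4) {
        /* Assemble the word byte by byte so the result does not
         * depend on host endianness or alignment. */
        crc ^= (uint32_t) p[0] | (uint32_t) p[1] << 8
            | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
        crc = crc_table[3][crc & 0xFF]
            ^ crc_table[2][(crc >> 8) & 0xFF]
            ^ crc_table[1][(crc >> 16) & 0xFF]
            ^ crc_table[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    /* Remaining 0-3 bytes go through the byte-at-a-time path. */
    while (len--)
        crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];
    return crc ^ 0xFFFFFFFFu;
}

/* Sanity check: both loops must agree on the same input. */
static void crc32_self_check(void)
{
    static const char msg[] = "123456789";

    crc32_init_tables();
    assert(crc32_slice4(msg, 9) == crc32_bytewise(msg, 9));
}
```

If the two functions ever disagree for some input, the sliced version has a bug, which is the kind of thing an "incorrect checksum in control file" error would point to.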

- Heikki
