Re: Enabling Checksums

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Ants Aasma <ants(at)cybertec(dot)at>, Florian Pflug <fgp(at)phlo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-04-23 15:28:33
Message-ID: CA+U5nMKVEu8UDXQe+Nk=d7Nqm4ypFSzaEf0esaK4j31LyQCOBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 23 April 2013 02:35, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Tue, 2013-04-23 at 01:08 +0300, Ants Aasma wrote:
>> A slight delay, but here it is. I didn't lift the checksum part into a
>> separate file as I didn't have a great idea what I would call it. The
>> code is reasonably compact so I don't see a great need for this right
>> now. It would be more worth the effort when/if we add non-generic
>> variants. I'm not particularly attached to the method I used to mask
>> out pd_checksum field, this could be improved if someone has a better
>> idea how to structure the code.
>
> Thank you. A few initial comments:
>
> I have attached (for illustration purposes only) a patch on top of yours
> that divides the responsibilities a little more cleanly.
>
> * easier to move into a separate file, and use your recommended compiler
> flags without affecting other routines in bufpage.c
> * makes the checksum algorithm itself simpler
> * leaves the data-page-specific aspects (mixing in the page number,
> ignoring pd_checksum, reducing to 16 bits) to PageCalcChecksum16
> * overall easier to review and understand
>
> I'm not sure what we should call the separate file or where we should
> put it, though. How about src/backend/utils/checksum/checksum_fnv.c? Is
> there a clean way to override the compiler flags for a single file so we
> don't need to put it in its own directory?

OK, I like that a lot better and it seems something I could commit.

I suggest the following additional changes...

* put the README stuff directly in the checksum.c file
* I think we need some external links that describe this algorithm
and we need comments that explain what we know about this in terms of
detection capability and why it was chosen against others
* We need some comments/analysis about whether the coding causes a
problem if vectorization is *not* available

* make the pg_control.data_checksums field into a version number, for
future flexibility...
patch attached

* rename the routine away from checksum_fnv so its simply a generic
checksum call - more modular.

That way all knowledge of the algorithm is in one file only. If we do
need to change the algorithm in the future we can more easily support
multiple versions.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
checksums_version.v1.patch application/octet-stream 4.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Manning 2013-04-23 15:30:59 Proposal to add --single-row to psql
Previous Message Alvaro Herrera 2013-04-23 15:08:34 Re: Performance with the new security release?