Re: Enabling Checksums

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Florian Pflug <fgp(at)phlo(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enabling Checksums
Date: 2013-04-18 02:08:10
Message-ID: 516F558A.7020907@2ndQuadrant.com
Lists: pgsql-hackers

On 4/17/13 8:56 PM, Ants Aasma wrote:
> Nothing from the two points, but the CRC calculation algorithm can be
> switched out for slice-by-4 or slice-by-8 variant. Speed up was around
> factor of 4 if I remember correctly...I can provide you
> with a patch of the generic version of any of the discussed algorithms
> within an hour, leaving plenty of time in beta or in 9.4 to
> accommodate the optimized versions.

Can you nail down a solid, potentially committable slice-by-4 or
slice-by-8 patch then? You dropped into things like per-byte overhead to
reach this conclusion, which was fine for letting the methods battle
each other. Maybe I missed it, but I don't remember an obvious full
patch for this implementation coming back out of that. With the schedule
pressure this needs to return to more database-level tests. Your concern
that the committed feature is much slower than the original Fletcher one
is troubling, and we might as well do that showdown again now with the
best of the CRC implementations you've found.
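
For anyone who hasn't dug into the slicing papers, the basic shape of
that kernel is below.  This is a generic sketch using the common
reflected CRC-32 polynomial purely for illustration; the function name,
table layout, and polynomial are mine, not anything from your patch or
from the committed code.  The point is just that the inner loop consumes
four input bytes per iteration through four lookup tables instead of one
byte through one table, which is where the roughly 4X win comes from:

#include <stdint.h>
#include <stddef.h>

/* Illustrative polynomial (reflected CRC-32); not the one under discussion */
#define EXAMPLE_POLY 0xEDB88320u

static uint32_t crc_table[4][256];

/* Build table 0 the usual bit-at-a-time way, then derive tables 1-3 */
static void
crc32_slice4_init(void)
{
    for (int i = 0; i < 256; i++)
    {
        uint32_t crc = i;

        for (int j = 0; j < 8; j++)
            crc = (crc >> 1) ^ ((crc & 1) ? EXAMPLE_POLY : 0);
        crc_table[0][i] = crc;
    }
    for (int i = 0; i < 256; i++)
    {
        crc_table[1][i] = (crc_table[0][i] >> 8) ^ crc_table[0][crc_table[0][i] & 0xFF];
        crc_table[2][i] = (crc_table[1][i] >> 8) ^ crc_table[0][crc_table[1][i] & 0xFF];
        crc_table[3][i] = (crc_table[2][i] >> 8) ^ crc_table[0][crc_table[2][i] & 0xFF];
    }
}

static uint32_t
crc32_slice4(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;     /* conventional initial value */

    /* Main loop: four bytes and four table lookups per iteration */
    while (len >= 4)
    {
        crc ^= (uint32_t) p[0] |
               ((uint32_t) p[1] << 8) |
               ((uint32_t) p[2] << 16) |
               ((uint32_t) p[3] << 24);
        crc = crc_table[3][crc & 0xFF] ^
              crc_table[2][(crc >> 8) & 0xFF] ^
              crc_table[1][(crc >> 16) & 0xFF] ^
              crc_table[0][(crc >> 24) & 0xFF];
        p += 4;
        len -= 4;
    }
    /* Finish any remaining bytes one at a time */
    while (len--)
        crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];

    return ~crc;                    /* conventional final inversion */
}

Slice-by-8 is the same idea with eight tables and eight bytes per
iteration, the tradeoff being larger tables competing for L1 cache.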

> Actually the state is that with the [CRC] polynomial used there is
> currently close to zero hope of CPUs optimizing for us.

Ah, I didn't catch that before. It sounds like the alternate slicing
implementation should also use a different polynomial then, which seems
reasonable. This doesn't even have to be the same CRC function that the
WAL uses. A CRC that's modified for performance or that has better
future potential is fine; there's just a lot of resistance to using
something other than a CRC right now.
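
To make the "CPUs optimizing for us" part concrete: the polynomial that
current x86 hardware accelerates is Castagnoli's CRC-32C, via the SSE4.2
crc32 instruction.  A rough sketch of that path is below; the function
name and structure are mine, it assumes gcc or clang with -msse4.2, and
a real version would need a runtime CPU check plus a table-driven
fallback:

#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <nmmintrin.h>          /* SSE4.2 intrinsics; build with -msse4.2 */

/*
 * The crc32 instruction is hard-wired to the Castagnoli polynomial
 * (CRC-32C), so this speedup is only available if that's the polynomial
 * we standardize on.
 */
static uint32_t
crc32c_hw(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t crc = 0xFFFFFFFFu;     /* conventional initial value */

    /* Eight input bytes per instruction in the main loop */
    while (len >= 8)
    {
        uint64_t chunk;

        memcpy(&chunk, p, 8);       /* unaligned-safe load */
        crc = _mm_crc32_u64(crc, chunk);
        p += 8;
        len -= 8;
    }
    /* Finish the tail one byte at a time */
    while (len--)
        crc = _mm_crc32_u8((uint32_t) crc, *p++);

    return ~(uint32_t) crc;         /* conventional final inversion */
}

Whether CRC-32C is the right polynomial for a 16 bit on-disk checksum is
exactly the question, but that's what the hardware help looks like today.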

> I'm not sure about the 9.4 part: if we ship with the builtin CRC as
> committed, there is a 100% chance that we will want to switch out the
> algorithm in 9.4, and there will be quite a large subset of users that
> will find the performance unusable.

Now I have to switch out my reviewer hat for my 3 bit fortune telling
one. (It uses a Magic 8 Ball.) This entire approach squeezes what people
would prefer to be a 32 bit CRC into a spare 16 bits, as a useful step
toward a long term goal. I've thought about four major branches of
possible futures here:

1) Database checksums with 16 bits are good enough, but they have to be
much faster to satisfy users. It may take a different checksum
implementation altogether to make that possible, and distinguishing
between the two of them requires borrowing even more metadata bits from
somewhere. (This seems to be the future you're worried about.)

2) Database checksums work out well, but they have to be 32 bits to
satisfy users and/or error detection needs. Work on pg_upgrade and
expanding the page headers will be needed. Optimization of the CRC now
has a full 32 bit target.

3) The demand for database checksums is made obsolete by mainstream
filesystem checksumming, performance issues, or just general market
whim. The 16 bit checksum PostgreSQL implements becomes a vestigial
feature, and whenever it gets in the way of making changes someone
proposes eliminating it. (I call this one the "rules" future.)

4) 16 bit checksums turn out to be such a problem in the field that
everyone regrets the whole thing, and discussions turn immediately
toward how to eliminate that risk.

It's fair that you're very concerned about (1), but I wouldn't give it
100% odds of happening either. The users whose demand motivated me to
work on this will be happy with any of (1) through (3), and in two of
those cases optimizing the 16 bit checksums now turns out to be
premature.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
