Re: Checksums, state of play

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checksums, state of play
Date: 2012-03-07 17:28:37
Message-ID: CA+TgmobuYVWx3+vWw6cLvPC0Y-kzzoAN2OVP1t2X5_biwH6agQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 6, 2012 at 2:27 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> The feature is nowhere near complete, and we should not be designing
> features at this stage.

I agree, on both counts. Although Simon did a good job pulling
together something that basically works in a short amount of time, the
edge cases still need a lot more thought and work. Yesterday's
discussion was mostly about turning the feature on and off, which
certainly seems to be the most significant problem with the patch as
it stands. But there are also a number of other things that have been
discussed and not fully resolved, such as the performance impact of
WAL-logging hint bit changes, the exact way we're going to sandwich
this into the page header, and the right way to handle the necessary
buffer locking. I think all of those issues can be resolved, but it's
not going to happen in a day, and even once it does there will still
be other, smaller things that need to be cleaned up here and there.
Really measuring and fixing all of these issues will be a matter of
months, not weeks.

Simon seems to be proposing that, in lieu of spending too much more
time fixing this, we just commit it and document the known
limitations. I don't agree with that. In particular, I think the
idea of committing a checksum patch that can produce false positives
in the event of a torn page situation is a really bad idea. The whole
point of the patch is to distinguish between hardware failure and
software failure; if we can't reliably do that, I don't see this as
being much of an advance over the status quo. I think we're going to
find that the cost of WAL-logging hints is bad enough that people are
only going to do it when they already suspect a problem and want
confirmation. If they can't rely on that confirmation being real, as
opposed to an outgrowth of a known limitation of the feature, I don't
see the point. I'd much rather see this feature wait for 9.3 than
ship something that's unreliable in this regard.
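
To make the torn-page concern concrete, here is a minimal sketch that is
not taken from the patch under discussion. It assumes a hypothetical page
layout (a 4-byte checksum followed by the page body), uses CRC-32 as a
stand-in for whatever algorithm the patch actually uses, and assumes the
storage device writes 512-byte sectors atomically. The names make_page,
verify_page, and torn_write are illustrative only.

```python
import zlib

PAGE_SIZE = 8192      # PostgreSQL's default block size
SECTOR_SIZE = 512     # assumption: the device writes one sector atomically
CHECKSUM_BYTES = 4    # hypothetical layout: checksum in the first 4 bytes


def page_checksum(body: bytes) -> int:
    """CRC-32 as a stand-in; the real patch may use a different algorithm."""
    return zlib.crc32(body) & 0xFFFFFFFF


def make_page(body: bytes) -> bytes:
    """Build a page image: 4-byte checksum followed by the page body."""
    assert len(body) == PAGE_SIZE - CHECKSUM_BYTES
    return page_checksum(body).to_bytes(CHECKSUM_BYTES, "little") + body


def verify_page(page: bytes) -> bool:
    """Return True if the stored checksum matches the page body."""
    stored = int.from_bytes(page[:CHECKSUM_BYTES], "little")
    return stored == page_checksum(page[CHECKSUM_BYTES:])


def torn_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first few sectors of new_page hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]


# Two valid versions of the same page, before and after a modification.
old_page = make_page(b"A" * (PAGE_SIZE - CHECKSUM_BYTES))
new_page = make_page(b"B" * (PAGE_SIZE - CHECKSUM_BYTES))

# The crash leaves the new checksum and 3 new sectors next to 13 old sectors.
torn_page = torn_write(old_page, new_page, sectors_written=3)

old_ok = verify_page(old_page)
new_ok = verify_page(new_page)
torn_ok = verify_page(torn_page)

print("old page verifies: ", old_ok)
print("new page verifies: ", new_ok)
print("torn page verifies:", torn_ok)
```

Nothing in this simulation involves faulty hardware: the mismatch comes
purely from an interrupted write. Unless something else repairs the page
before it is verified (for example, restoring a full-page image from WAL
during recovery), the check cannot by itself tell that case apart from
real corruption.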

So I think it's time to push this one out to 9.3.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
