Re: Protecting against unexpected zero-pages: proposal

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Protecting against unexpected zero-pages: proposal
Date: 2010-11-09 15:25:23
Message-ID: AANLkTi=mfepkzVBPySd9faugJvjdkg=7A-2NsX3tOvZx@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 9, 2010 at 2:28 PM, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> On Tue, Nov 9, 2010 at 8:45 AM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
>
>> But buffering the page only means you've got some consistent view of
>> the page. It doesn't mean the checksum will actually match the data in
>> the page that gets written out. So when you read it back in the
>> checksum may be invalid.
>
> I was assuming that if the code went through the trouble to buffer the
> shared page to get a "stable, non-changing" copy to use for
> checksumming/writing it, it would write() the buffered copy it just
> made, not the original in shared memory...  I'm not sure how that
> write could be in-consistent.

Oh, I'm mistaken. The problem was that buffering the writes was
insufficient to deal with torn pages. Even if you buffer the writes, if
the machine crashes after writing out only half of the buffer then the
checksum won't match. If the only changes on the page were hint bit
updates then there will be no full-page write in the WAL to repair the
block.
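
To make the torn-page case concrete, here is a minimal sketch, not
PostgreSQL code. It assumes a hypothetical page layout with a CRC-32 in
the first 4 bytes, an 8192-byte page, and a crash that leaves exactly the
first half of the new image on disk. The names stamp() and verify() are
made up for illustration.

```python
import zlib

PAGE_SIZE = 8192   # assumed page size for this sketch
CSUM_LEN = 4       # hypothetical layout: checksum in the first 4 bytes


def stamp(body: bytes) -> bytes:
    """Build a page image: CRC-32 of the body, followed by the body."""
    assert len(body) == PAGE_SIZE - CSUM_LEN
    return zlib.crc32(body).to_bytes(CSUM_LEN, "little") + body


def verify(page: bytes) -> bool:
    """Recompute the CRC-32 of the body and compare with the stored one."""
    stored = int.from_bytes(page[:CSUM_LEN], "little")
    return stored == zlib.crc32(page[CSUM_LEN:])


# The page as it currently sits on disk, with a valid checksum.
old_body = bytes(PAGE_SIZE - CSUM_LEN)
old_page = stamp(old_body)

# A hint-bit-style change: one bit flips in the second half of the page.
# The checksum is computed over a private (buffered) copy, so the image
# we intend to write is internally consistent.
new_body = bytearray(old_body)
new_body[6000] |= 0x01
new_page = stamp(bytes(new_body))

# Crash mid-write: only the first half of the new image reaches disk,
# the second half still holds the old contents.
half = PAGE_SIZE // 2
torn_page = new_page[:half] + old_page[half:]

print("old page valid: ", verify(old_page))
print("new page valid: ", verify(new_page))
print("torn page valid:", verify(torn_page))
```

The buffered copy itself verifies fine; it is only the half-written image
on disk that fails, because the new checksum landed but the changed bit
did not.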

It's possible that *that* situation is rare enough to let the checksum
raise a warning but not an error.

But personally I'm pretty loath to buffer every page write. The state
of the art is zero-copy processing, and we should be looking to reduce
copies rather than increase them. Though I suppose if we did a
zero-copy CRC that might actually get us this buffered write for free.
