Re: Protecting against unexpected zero-pages: proposal

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Protecting against unexpected zero-pages: proposal
Date: 2010-11-09 13:45:01
Message-ID: AANLkTimTwMwEBtod=gjufKtTC=_YBJ+Ei=aGV2-8d157@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 8, 2010 at 5:59 PM, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> The problem that putting checksums in a different place solves is the
> page layout (binary upgrade) problem.  You're still going to need to
> "buffer" the page as you calculate the checksum and write it out.
> Buffering that page is absolutely necessary no matter where you put the
> checksum, unless you've got an exclusive lock that blocks even hint
> updates on the page.

But buffering the page only means you've got some consistent view of
the page. It doesn't mean the checksum will actually match the data in
the page that gets written out. So when you read it back in the
checksum may be invalid.

I wonder if we could get by with a global counter on the page which
you increment whenever you set a hint bit. That way, when you read the
page back in, you could compare the counter on the page with the
counter recorded for the checksum, and if the checksum's counter is
behind, ignore the checksum. It would be nice to do better, but I'm not
sure we can.
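
To make the counter idea concrete, here is a rough sketch in Python
rather than anything resembling real PostgreSQL code. The Page class,
its field names, and the use of CRC-32 are all assumptions made up for
illustration; the only point is the comparison of the two counters on
read.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical page: not the real PostgreSQL page layout."""
    data: bytearray
    hint_counter: int = 0      # bumped every time a hint bit is set
    checksum: int = 0          # checksum of data at stamping time
    checksum_counter: int = 0  # value of hint_counter at stamping time

    def set_hint_bit(self, offset: int, mask: int) -> None:
        # Setting a hint bit changes the data without restamping the checksum.
        self.data[offset] |= mask
        self.hint_counter += 1

    def stamp_checksum(self) -> None:
        # Record the checksum together with the counter it was computed for.
        self.checksum = zlib.crc32(bytes(self.data))
        self.checksum_counter = self.hint_counter

    def verify(self) -> str:
        # If hint bits were set after the checksum was computed, the
        # checksum is known to be stale, so it is ignored, not reported.
        if self.checksum_counter < self.hint_counter:
            return "skipped"
        if zlib.crc32(bytes(self.data)) == self.checksum:
            return "ok"
        return "corrupt"
```

The obvious cost shows up in the "skipped" case: a page whose hint bits
were set after the checksum was computed gets no protection at all
until the checksum is stamped again.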

>
> But if we can start using forks to put "other data", that means that
> keeping the page layouts is easier, and thus binary upgrades are much
> more feasible.
>

The difficulty with the page layout didn't come from the checksum
itself. We can add 4 or 8 bytes to the page header easily enough. The
difficulty came from trying to move the hint bits for all the tuples
to a dedicated area. That means three resizable areas so either one of
them would have to be relocatable or some other solution (like not
checksumming the line pointers and putting the hint bits in the line
pointers). If we're willing to have invalid checksums whenever the
hint bits get set then this wouldn't be necessary.
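
As a rough illustration of the "don't checksum the line pointers"
alternative, the sketch below computes a checksum that skips one byte
range of the page. The function name, the offsets, and the choice of
CRC-32 are hypothetical and only there to show the shape of the idea;
nothing here reflects the actual page format.

```python
import zlib


def checksum_excluding(page: bytes, lp_start: int, lp_end: int) -> int:
    """CRC-32 over the page, skipping the bytes in [lp_start, lp_end).

    lp_start and lp_end are assumed to delimit the line pointer array,
    which in this sketch is where the hint bits would live.
    """
    crc = zlib.crc32(page[:lp_start])       # everything before the line pointers
    return zlib.crc32(page[lp_end:], crc)   # continue with everything after them


# Toy 64-byte page with an assumed line pointer area at bytes 8..15.
page = bytearray(64)
before = checksum_excluding(bytes(page), 8, 16)

page[10] |= 0x01  # set a "hint bit" inside the excluded area
after_hint = checksum_excluding(bytes(page), 8, 16)

page[40] ^= 0xFF  # damage a byte in the checksummed area
after_damage = checksum_excluding(bytes(page), 8, 16)
```

Here after_hint equals before, since the hint bit sits in the excluded
range, while after_damage differs. The price is that the line pointers
themselves are then not protected by the checksum.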

--
greg
