Re: Block-level CRC checks

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2008-11-13 18:45:52
Message-ID: 19934.1226601952@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I'm still testing this; please beware that this likely has an even
> higher bug density than my regular patches (and some debugging printouts
> as well).

This seems impossibly fragile ... and the non-modular assumptions about
what is in a disk page aren't even the worst part :-(. The worst part
is the race conditions.

In particular, the code added to FlushBuffer effectively assumes that
the PD_UNLOGGED_CHANGE bit is set sooner than the actual hint bit change
occurs. Even if the tqual.c code did that in the right order, which it
doesn't, you can't assume that the updates will become visible to other
CPUs in the expected order. This might be fixable with introduction of
some memory barrier operations but it's certainly broken as-is.

Also, if you do make tqual.c set the bits in that order, it's not clear
how you can ever *clear* PD_UNLOGGED_CHANGE without introducing a race
condition at that end. (The patch actually neglects to do this anywhere,
which means that it won't be long till every page in the DB has got that
bit set all the time, which I don't think we want.)

I also don't like that you've got different CPUs trying to set or clear
the same PD_UNLOGGED_CHANGE bit with no locking. We can tolerate that
for ordinary hint bits because it's not critical if an update gets lost.
But in this scheme PD_UNLOGGED_CHANGE is not an optional hint bit: you
*will* mess up if it fails to get set. Even more insidiously, the
scheme will certainly fail if someone ever tries to add another
asynchronously-updated hint bit in pd_flags, since an update of one of
the bits might overwrite a concurrent update of the other. Also, it's
not inconceivable (depending on how wide the processor/memory bus is)
that one processor updating PD_UNLOGGED_CHANGE could overwrite some
other processor's change to the nearby pd_checksum or pd_lsn or pd_tli
fields.

Basically, you can't make any critical changes to a shared buffer
if you haven't got exclusive lock on it. But that's exactly what
this patch is assuming it can do.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2008-11-13 18:46:50 Re: Simple postgresql.conf wizard
Previous Message Gregory Stark 2008-11-13 18:43:22 Re: Simple postgresql.conf wizard