Re: corrupt pages detected by enabling checksums

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: corrupt pages detected by enabling checksums
Date: 2013-04-30 21:54:08
Message-ID: 1367358848.9300.21.camel@sussancws0025
Lists: pgsql-hackers

On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote:
> Uh, wait a minute. I think this is completely wrong. The buffer is
> LOCKED for this entire sequence of operations. For a checkpoint to
> "happen", it's got to write every buffer, which it will not be able to
> do for so long as the buffer is locked.

I went back and forth on this, so you could be right, but here was my
reasoning:

I was worried because SyncOneBuffer checks whether it needs writing
without taking a content lock, so the exclusive lock doesn't help. That
makes sense, because you don't want a checkpoint to have to get a
content lock on every buffer in the buffer pool. But it also means we
need to follow the rules laid out in transam/README and dirty the pages
before writing WAL.
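
To make the ordering concern concrete, here is a toy model in C. It is only a sketch of the reasoning above, not PostgreSQL code: ToyBuffer, ToyState, toy_checkpoint and toy_change_survives_crash are made-up names, and the checkpoint is reduced to "fix the redo point, then peek at the dirty flag without the content lock". It places a checkpoint between the two steps of a page change and asks whether the change would survive a crash immediately afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
typedef struct
{
    bool        dirty;          /* stands in for the buffer's dirty flag */
} ToyBuffer;

typedef struct
{
    uint64_t    insert_pos;     /* next WAL position to be handed out */
    uint64_t    redo;           /* where replay would start after a crash */
    bool        page_flushed;   /* did the checkpoint decide to write the page? */
} ToyState;

/*
 * Checkpoint in this model: fix the redo point first, then look at the
 * dirty flag WITHOUT taking the content lock (the assumption under test).
 */
static void
toy_checkpoint(ToyState *st, const ToyBuffer *buf)
{
    st->redo = st->insert_pos;
    st->page_flushed = buf->dirty;
}

/*
 * Perform one page change with a checkpoint landing between the two steps.
 * Returns true if the change would survive a crash right afterwards, i.e.
 * if its WAL record is at or after the redo point, or the checkpoint chose
 * to write the page.
 */
static bool
toy_change_survives_crash(bool dirty_before_wal)
{
    ToyState    st = {0, 0, false};
    ToyBuffer   buf = {false};
    uint64_t    record_pos;

    if (dirty_before_wal)
    {
        buf.dirty = true;               /* mark dirty first */
        toy_checkpoint(&st, &buf);      /* checkpoint sees the dirty flag */
        record_pos = st.insert_pos++;   /* then insert the WAL record */
    }
    else
    {
        record_pos = st.insert_pos++;   /* WAL record first */
        toy_checkpoint(&st, &buf);      /* checkpoint sees a clean buffer */
        buf.dirty = true;               /* marked dirty too late */
    }

    assert(buf.dirty);                  /* either way the page ends up dirty */
    return record_pos >= st.redo || st.page_flushed;
}
```

In this model, toy_change_survives_crash(true) returns true and toy_change_survives_crash(false) returns false: with WAL written first, the record falls before the redo point and the page is skipped as clean, so nothing covers the change.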

> The effect of the change to lazy_scan_heap is to force the buffer to
> be written even if we're only updating the visibility map page.
> That's a bad idea and should be reverted.

The only time the VM and the data page are out of sync during vacuum is
after a crash, right? If that's the case, I didn't think it was a big
deal to dirty one extra page (should be extremely rare). Am I missing
something?

The reason I removed that special case was just code
complexity/readability. I tried preserving the previous behavior, and
it's not so bad, but it seemed unnecessarily ugly for the benefit of a
rare case.
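
For reference, the tradeoff can be sketched with another toy model in C. This is not the lazy_scan_heap code; ToyHeapPage, ToyVmPage and both functions are hypothetical names used only to show which pages each variant dirties when the heap page already carries its all-visible flag but the VM bit is missing (the post-crash case described above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
typedef struct
{
    bool        all_visible;    /* page-level "all visible" flag */
    bool        dirty;
} ToyHeapPage;

typedef struct
{
    bool        bit_set;        /* visibility map bit for the heap page */
    bool        dirty;
} ToyVmPage;

/* Set the VM bit if needed; shared by both variants below. */
static void
toy_set_vm_bit(ToyVmPage *vm)
{
    if (!vm->bit_set)
    {
        vm->bit_set = true;
        vm->dirty = true;
    }
}

/* Special-case variant: dirty the heap page only if its flag changes. */
static void
toy_mark_all_visible_special_case(ToyHeapPage *heap, ToyVmPage *vm)
{
    assert(heap != NULL && vm != NULL);
    if (!heap->all_visible)
    {
        heap->all_visible = true;
        heap->dirty = true;
    }
    toy_set_vm_bit(vm);
}

/* Simplified variant: always dirty the heap page, flag change or not. */
static void
toy_mark_all_visible_simple(ToyHeapPage *heap, ToyVmPage *vm)
{
    assert(heap != NULL && vm != NULL);
    heap->all_visible = true;
    heap->dirty = true;
    toy_set_vm_bit(vm);
}
```

When neither the flag nor the bit is set, both variants dirty both pages. They differ only when the heap flag is already set and the VM bit is not, where the simplified variant dirties one extra heap page.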

Regards,
Jeff Davis
