Re: patch for new feature: Buffer Cache Hibernation

From: Mitsuru IWASAKI <iwasaki(at)jp(dot)FreeBSD(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: patch for new feature: Buffer Cache Hibernation
Date: 2011-05-05 10:10:35
Message-ID: 20110505.191035.71476022.iwasaki@jp.FreeBSD.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, thanks for good suggestions.

> > Postgres usually starts with ZERO buffer cache. By saving the buffer
> > cache data structure into hibernation files just before shutdown, and
> > loading them at startup, postgres can start operations with the saved
> > buffer cache as the same condition as just before the last shutdown.
>
> This seems like a lot of complication for rather dubious gain. What
> happens when the DBA changes the shared_buffers setting, for instance?

It was my first concern actually. Current implementation is stopping
reading hibernation file when detecting the size mismatch among
shared_buffers and hibernation file. I think it is a safety way.
As Alvaro Herrera mentioned, it would be possible to adjust copying
buffer bloks, but changing shared_buffers setting is not so often I
think.

> How do you protect against the cached buffers getting out-of-sync with
> the actual disk files (especially during recovery scenarios)? What

Saving DB buffer cahce is called at shutdown after finishing
bgwriter's final checkpoint process, so dirty-buffers should not exist
I believe.
For recovery scenarios, I need to research it though...
Could you describe what is need to be consider?

> about crash-induced corruption in the cache file itself (consider the
> not-unlikely possibility that init will kill the database before it's
> had time to dump all the buffers during a system shutdown)? Do you have

I think this is important point. I'll implement validation function for
hibernation file.

> any proof that writing out a few GB of buffers and then reading them
> back in is actually much cheaper than letting the database re-read the
> data from the disk files?

I think this means sequential-read vs scattered-read.
The largest hibernation file is for buffer blocks, and sequential-read
from it would be much faster than scattered-read from database file
via smgrread() block by block.
As Greg Stark suggested, re-reading from database file based on buffer
descriptors was one of implementation candidates (it can reduce
storage consumption for hibernation), but I chose creating buffer
blocks raw image file and reading it for the performance.

Thanks

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Teodor Sigaev 2011-05-05 11:06:39 Re: GSoC 2011: Fast GiST index build
Previous Message Andres Freund 2011-05-05 09:21:36 Backpatching of "Teach the regular expression functions to do case-insensitive matching"