Re: pg_prewarm

From: Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_prewarm
Date: 2012-03-10 20:49:54
Message-ID: AF9AF9CA-AE1F-409F-B569-E5DA715D7286@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mar 9, 2012, at 2:34 PM, Robert Haas wrote:

> On Fri, Mar 9, 2012 at 5:42 AM, Hans-Jürgen Schönig
> <postgres(at)cybertec(dot)at> wrote:
>> we had some different idea here in the past: what if we had a procedure / method to allow people to save the list of current buffers / cached blocks to be written to disk (sorted). we could then reload this "cache profile" on startup in the background or people could load a certain cache content at runtime (maybe to test or whatever).
>> writing those block ids in sorted order would help us to avoid some random I/O on reload.
>
> I don't think that's a bad idea at all, and someone actually did write
> a patch for it at one point, though it didn't get committed, partly I
> believe because of technical issues and partly because Greg Smith was
> uncertain how much good it did to restore shared_buffers without
> thinking about the OS cache. Personally, I don't buy into the latter
> objection: a lot of people are running with data sets that fit inside
> shared_buffers, and those people would benefit tremendously.
>
> However, this just provides mechanism, not policy, and is therefore
> more general. You could use pg_buffercache to save the cache contents
> at shutdown and pg_prewarm to load those blocks back in at startup, if
> you were so inclined. Or if you just want to load up your main
> relation, and its indexes, you can do that, too.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

i also think that it can be beneficial.
once in a while people ask how to "bring a database up to speed" after a restart. i have seen more than one case when a DB was close to death after a restart because random I/O was simply killing it during cache warmup. it seems the problem is getting worse as we see machines with more and more RAM in the field.
technically i would see a rather brute force approach: if we just spill out of the list of blocks we got in shared buffer atm (not content of course, just physical location sorted by file / position in file) it would be good enough. if a block physically does not exist on reload any more it would not even be an issue and allow people basically to "snapshot" their cache status. we could allow named cache profiles or so and make a GUC to indicate of one of them should be preloaded on startup (background or beforehand - i see usecases for both approaches).

yes, somehow linking to pg_buffercache makes a lot of sense. maybe just extending it with some extra functions is already enough for most cases.

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Stefan Keller 2012-03-10 21:35:14 Re: pg_prewarm
Previous Message Marti Raudsepp 2012-03-10 20:00:43 Refactoring simplify_function (was: Caching constant stable expressions)