Re: pg_prewarm

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: cedric(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, Stefan Keller <sfkeller(at)gmail(dot)com>
Subject: Re: pg_prewarm
Date: 2012-07-07 19:35:33
Message-ID: CAMkU=1xba7Hbw+Wf28mQ1a-PqDf5hwo7x1xUAz8anSczqKJvHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This is a review for pg_prewarm V2.

It applies (with some fuzz, but it is handled correctly) and builds cleanly.

It includes docs, but does not include regression tests, which it
probably should (just to verify that it continues to compile and
execute without throwing errors, I wouldn't expect an automated test
to verify actual performance improvement).

I think we want this. There is some discussion about how much overlap
it has with pgfincore, but I don't think there is an active proposal
to put that into contrib, so don't see that as blocking this.

It works as advertised. using pgbench -i -s100 (about 1.5Gig), with
shared_buffers of current default (128 MB), it takes 10 minutes for
pgbench -S to revive the cache from a cold start and reach its full
TPS. If I use pg_prewarm on both pgbench_accounts and
pgbench_accounts_pkey from a cold start, it takes 22 seconds, and then
pgbench -S runs at full speed right from the start.

It does not matter if I use 'read' or 'buffer'. While all the data
doesn't fit in shared_buffers, trying to read it into the buffers acts
to populate the file cache anyway, and doesn't take significantly more
time.

On my test system (openSuse 12.1) 'prefetch' took just as long 'read'
or 'buffer', and sometimes it seemed to fail to load everything (it
would take pgbench -S up to 60 seconds to reach full speed). I expect
this to be too system depend to care much about figuring what is going
on, though.

For the implementation:

1)
I think that for most users, making them know or care about forks and
block numbers is too much. It would be nice if there were a
single-argument form: pg_prewarm(relation) which loaded all of either
main, or all of all forks, using 'buffer'. This seems like a good
default. Also, the last two arguments are NULL in all the given
examples. Do we expect those to be used only for experimental
purposes by hackers, or are those of general interest?

2)
The error message:
ERROR: prewarm type cannot be null

Should offer the same hint as:

ERROR: invalid prewarm type
HINT: Valid prewarm types are "prefetch", "read", and "buffer".

3)
In the docs, the precedence seems to be that fork names ('main', here)
when in LITERAL classes are shown with single quotes around them,
rather than bare.

4) Not sure that the copyright should start in 2010 in pg_prewarm.c:
Copyright (c) 2010-2012

I have not tested on a system which does not support posix_fadvise.

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2012-07-07 19:46:11 Re: Schema version management
Previous Message Tom Lane 2012-07-07 18:46:28 regex_fixed_prefix() is still a few bricks shy of a load