Re: Just-in-time Background Writer Patch+Test Results

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Just-in-time Background Writer Patch+Test Results
Date: 2007-09-07 15:48:42
Message-ID: Pine.GSO.4.64.0709071119140.16702@westnet.com
Lists: pgsql-hackers

On Fri, 7 Sep 2007, Simon Riggs wrote:

> I think that is what we should be measuring, perhaps in a simple way
> such as calculating the 90th percentile of the response time
> distribution.

I do track the 90th percentile numbers, but in these pgbench tests where
I'm writing as fast as possible they're actually useless--in many cases
they're *smaller* than the average response, because there are enough
cases where there is a really, really long wait that they skew the average
up really hard. Take a look at any of the individual test graphs and
you'll see what I mean.
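
As an illustration of how a few very long waits can push the average above
the 90th percentile, here is a minimal sketch. The latency values are made
up for the example, not taken from the pgbench runs, and the `percentile`
helper is a hypothetical nearest-rank implementation.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds: 95 fast transactions
# plus 5 that stalled for a really long time.
latencies_ms = [2.0] * 95 + [2000.0] * 5

mean_ms = statistics.mean(latencies_ms)
p90_ms = percentile(latencies_ms, 90)

print(f"mean = {mean_ms:.1f} ms, 90th percentile = {p90_ms:.1f} ms")
```

With only 5% of the samples stalling, the 90th percentile still lands on a
fast transaction while the mean is dragged far above it.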

> Looking at the tps also tempts us to run a test which maxes out the
> server, an area we already know and expect the bgwriter to be unhelpful
> in.

I tried to turn that around and make my thinking be that if I built a
bgwriter that did most of the writes without badly impacting the measure
we know and expect it to be unhelpful in, that would be more likely to
yield a robust design. It kept me out of areas where I might have built
something that had to be disclaimed with "don't run this when the server
is maxed out".

> For me, the bgwriter should sleep for at most 10ms at a time. If it has
> nothing to do it can go straight back to sleep again. Trying to set that
> time is fairly difficult, so it would be better not to have to set it at
> all.

I wanted to get this patch out there so people could start thinking about
what I'd done and consider whether this still fit into the 8.3 timeline.
What I'm doing myself right now is running tests with a much lower setting
for the delay time--I'm testing 20ms at the moment. I personally would be
happy saying it's 10ms and that's it. Is anyone using a time lower than
that right now? I seem to recall that 10ms was also the shortest interval
Heikki used in his tests.

> I get the feeling that what we have here is better than what we had
> before, but I guess I'm a bit disappointed we still have 3 magic
> parameters, or 5 if you count your hard-coded ones also.

I may be able to eliminate more of them, but I didn't want to take them
out before beta. If it can be demonstrated that some of these parameters
can be set to specific values and still work across a wider range of
applications than what I've tested, then there's certainly room to fix
some of these, which actually makes some things easier. For example, I'd
be more confident fixing the weighted average smoothing period to a
specific number if I knew the delay was fixed, and there's two parameters
gone. And the multiplier is begging to be eliminated; I just need some more
data to confirm that's true.
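
To show why a fixed delay would make it easier to fix the smoothing period,
here is a rough sketch. It assumes a simple running weighted average updated
once per bgwriter pass; the patch's actual formula is not given in this
message, and the names `smoothed`, `window_ms`, and `smoothing_samples`, as
well as all the numbers, are hypothetical.

```python
def smoothed(previous, sample, smoothing_samples):
    """Move the running average 1/smoothing_samples of the way
    toward the newest sample (assumed form of the weighted average)."""
    return previous + (sample - previous) / smoothing_samples


def window_ms(smoothing_samples, delay_ms):
    """Approximate wall-clock span the average reacts over when one
    sample is taken per delay interval."""
    return smoothing_samples * delay_ms


# The same smoothing period covers very different time spans
# depending on the delay, which is why the two settings interact.
for delay_ms in (200, 20, 10):
    print(delay_ms, "ms delay ->", window_ms(16, delay_ms), "ms window")
```

Under that assumption, pinning the delay turns the smoothing period into a
fixed span of time, so a single value could be chosen for it.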

> There's still no formal way to tune these. As long as we have *any*
> magic parameters, we need a way to tune them in the field, or they are
> useless. At very least we need a plan for how people will report results
> during Beta. That means we need a log_bgwriter (better name, please...)
> parameter that provides information to assist with tuning.

Since I got past the "does it work?" stage, I've been doing all the tuning
work using before/after snapshots of pg_stat_bgwriter data taken around a
representative period of activity and looking at the delta. Been a
while since I actually looked into the logs for anything. It's very
straightforward to put together a formal tuning plan using the data in
there, particularly compared to the impossibility of creating such a
plan in the current code.
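
The before/after approach can be sketched roughly as follows. This assumes
the two snapshots have already been read from pg_stat_bgwriter into
dictionaries; the counter names are an assumption about what the view
exposes, and every number is invented for the example.

```python
def snapshot_delta(before, after):
    """Per-counter difference between two snapshots of cumulative stats."""
    return {name: after[name] - before[name] for name in before}


def write_shares(delta):
    """Fraction of buffer writes done by checkpoints, the bgwriter
    cleaner, and the backends themselves during the test window."""
    sources = ("buffers_checkpoint", "buffers_clean", "buffers_backend")
    total = sum(delta[name] for name in sources)
    if total == 0:
        return {}
    return {name: delta[name] / total for name in sources}


# Hypothetical snapshots taken before and after a test run.
before = {"buffers_checkpoint": 1000, "buffers_clean": 500,
          "buffers_backend": 2000, "maxwritten_clean": 3}
after = {"buffers_checkpoint": 1400, "buffers_clean": 2500,
         "buffers_backend": 2600, "maxwritten_clean": 3}

delta = snapshot_delta(before, after)
shares = write_shares(delta)
for name, share in shares.items():
    print(f"{name}: {delta[name]} buffers ({share:.0%})")
```

Comparing these shares across runs with different settings is one way to
judge whether the bgwriter is taking write work away from the backends.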

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
