Re: Just-in-time Background Writer Patch+Test Results

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Just-in-time Background Writer Patch+Test Results
Date: 2007-09-18 15:51:12
Message-ID: Pine.GSO.4.64.0709181105470.27154@westnet.com
Lists: pgsql-hackers

It was suggested to me today that I should clarify how others can test
this patch themselves by writing a sort of performance reviewer's guide;
until now that information has been scattered among material covering
development. That's what you'll find below. Let me know if any of it
seems confusing and I'll try to clarify. I'll be checking my mail and
responding intermittently while I'm away; I just won't be able to run any
tests myself until next week.

The latest version of the background writer code that I've been reporting
on is attached to the first message in this thread:

http://archives.postgresql.org/pgsql-hackers/2007-09/msg00214.php

I haven't found any reason so far to update that code; the existing
exposed tunables still appear sufficient for all the situations I've
run into.

Track Buffer Allocations and Cleaner Efficiency
-----------------------------------------------

First, apply the patch inside buf-alloc-2.patch.gz, which adds several
entries to pg_stat_bgwriter; it applied cleanly to HEAD at the point when
I generated it. I'd suggest testing that one by itself first, both to
collect baseline information with the current background writer and to
confirm that the overhead of tracking the buffer allocations doesn't
cause a performance hit, before applying the second patch. To make such
comparisons I keep two clusters configured for the same port, one built
with just buf-alloc-2 and one with both patches, with only one of them
running at a time. You'll need to run initdb to create a database with the
new stats in it after applying the patch.

What I've been doing to test the effectiveness of any LRU background
writer method using this patch is to take a before/after snapshot of
pg_stat_bgwriter. Then I compute the delta over the test run in order
to figure out what percentage of buffers were written by the background
writer vs. the client backends; that's the number I'm reporting as
cleaner_pct in my tests. Here is an example of how to compute that against
the cumulative totals in pg_stat_bgwriter, rather than against a delta:

select round(buffers_clean * 10000 / (buffers_backend + buffers_clean)) /
100 as cleaner_pct from pg_stat_bgwriter;
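
The before/after snapshot approach can also be sketched directly in SQL.
This is only a minimal illustration, not the code from pgbench-tools: the
table name bgw_snapshot is made up for this example, and it uses only the
buffers_clean, buffers_backend and maxwritten_clean columns mentioned in
this message. The nullif() call is an addition that returns NULL instead
of raising a division-by-zero error when nothing was written during the
run.

```sql
-- Before the test: save the current counters.
-- bgw_snapshot is a hypothetical name, not the test_bgwriter table
-- that pgbench-tools uses.
create table bgw_snapshot as
  select now() as taken,
         buffers_clean,
         buffers_backend,
         maxwritten_clean
  from pg_stat_bgwriter;

-- ... run the benchmark here ...

-- After the test: subtract the saved counters from the current ones.
-- Both relations hold exactly one row, so the join yields one row.
select cur.buffers_clean    - snap.buffers_clean    as clean_delta,
       cur.buffers_backend  - snap.buffers_backend  as backend_delta,
       cur.maxwritten_clean - snap.maxwritten_clean as maxwritten_delta,
       -- Percentage of buffer writes done by the background writer,
       -- rounded to two decimal places.
       round(100.0 * (cur.buffers_clean - snap.buffers_clean)
             / nullif((cur.buffers_clean - snap.buffers_clean)
                    + (cur.buffers_backend - snap.buffers_backend), 0),
             2) as cleaner_pct
from pg_stat_bgwriter cur, bgw_snapshot snap;

-- Clean up so the next run starts from a fresh snapshot.
drop table bgw_snapshot;
```

Unlike the one-line query above, this version rounds rather than
truncates the percentage, so the last digit can differ slightly.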

You should also monitor maxwritten_clean to make sure you've set
bgwriter_lru_maxpages high enough that it's not limiting writes. You can
always turn the background writer off by setting bgwriter_lru_maxpages to
0 (that's the only way to do so after applying the second patch, described
below).
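
A quick way to make that check, as a rough sketch: look at the configured
limit, then read maxwritten_clean before and after a test run. The counter
is incremented each time a cleaning scan stops because it hit the
bgwriter_lru_maxpages limit, so a value that keeps climbing during the run
suggests the limit is too low. How much growth is acceptable is a judgment
call and is not specified here.

```sql
-- Current per-round write limit for the LRU cleaner.
show bgwriter_lru_maxpages;

-- Run this before and after the test and compare the two values;
-- buffers_clean is included only to give the count some context.
select maxwritten_clean, buffers_clean from pg_stat_bgwriter;
```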

For reference, the exact code I'm using to save the deltas and compute
everything is available within pgbench-tools-0.2 at
http://www.westnet.com/~gsmith/content/postgresql/pgbench-tools.htm

The code inside the benchwarmer script uses a table called test_bgwriter
(schema in init/resultdb.sql), populates it before the test, then computes
the delta afterwards. bufsummary.sql generates the results I've been
putting in my messages. I assume there's a cleaner way to compute just
these numbers by resetting the statistics before the test instead, but
that didn't fit into what I was working towards.

New Background Writer Logic
---------------------------

The second patch in jit-cleaner.patch.gz applies on top of buf-alloc-2.
It modifies the LRU background writer with the just-in-time logic as I
described in the message the patches were attached to. The main tunable
there is bgwriter_lru_multiplier, which replaces bgwriter_lru_percent.
The effective range seems to be 1.0 to 3.0. You can take an existing 8.3
postgresql.conf, rename bgwriter_lru_percent to bgwriter_lru_multiplier,
adjust the value to be in the right range, and then it will work with this
patched version.

For comparing the patched vs. original BGW behavior, I've taken to keeping
definitions for both variables in a common postgresql.conf, and then I
just comment/uncomment the one I need based on which version I'm running:

bgwriter_lru_multiplier = 1.0
#bgwriter_lru_percent = 5

The main thing I've noticed so far is that as you decrease bgwriter_delay
from the default of 200ms, the multiplier has needed to be larger to
maintain the same cleaner percentage in my tests.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
