Instrument checkpoint sync calls

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Instrument checkpoint sync calls
Date: 2010-11-14 21:37:58
Message-ID: 4CE056B6.4050901@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached patch adds some logging for each individual fsync call made
during a checkpoint, along with a summary at the end. You need to have
the following to see all of the detail:

log_checkpoints=on
log_min_messages=debug1

And here's a sample:

LOG: checkpoint starting: immediate force wait
DEBUG: checkpoint sync: file=1 time=1.946000 msec
DEBUG: checkpoint sync: file=2 time=0.666000 msec
DEBUG: checkpoint sync: file=3 time=0.004000 msec
LOG: checkpoint sync: files=3 longest=1.946000 msec average=0.872000 msec
LOG: checkpoint complete: wrote 3 buffers (0.1%); 0 transaction log
file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.002 s,
total=0.003 s

I think that it's reasonable for the sort of people who turn
log_checkpoints on to also get the sync summary line, thus it being
logged at LOG level. The detail on individual syncs might go to DEBUG2
or lower instead of DEBUG1 where I put it, that part I don't have a
strong opinion on. It's at DEBUG1 to make testing the patch without a
gigantic amount of log data also coming in easier.

Right now the code is written such that all the calls that grab timing
information are wrapped around "ifdef DEBUG_FSYNC", which is a variable
set to 1 that could be a compile-time option like DEBUG_DEADLOCK, to
allow turning this code path off at build time. I personally think that
if you're already making an fsync call and have log_checkpoints on, the
additional overhead of also timing that fsync is minimal even on
platforms where timing is slow (I don't have such a system to test that
assumption however). And I've seen enough value in troubleshooting
nasty checkpoint sync problems using this patch to feel it's worth
having even if it does add some overhead.

I'm a little concerned about log_checkpoints changing on me in the
middle of the execution of a checkpoint, which would cause some problems
here. Not sure if that's something I should explicitly code for, given
that all I think it will do is screw up one of the timing results. It
does seem a risk from the last minute self-review I just did of the code.

I'll give a sample program that stresses the system, generating slow
timing results and other types of bad behavior, along with the next
patch I submit here shortly.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us

Attachment Content-Type Size
log-sync-v2.patch text/x-patch 2.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2010-11-14 21:38:36 Re: wCTE behaviour
Previous Message Tom Lane 2010-11-14 21:36:50 Re: pg_stat_bgwriter broken?