Re: Expose checkpoint start/finish times into SQL.

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Theo Schlossnagle <jesus(at)omniti(dot)com>
Subject: Re: Expose checkpoint start/finish times into SQL.
Date: 2008-04-04 04:09:31
Message-ID: Pine.GSO.4.64.0804032331110.13806@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

On Thu, 3 Apr 2008, Robert Treat wrote:

> You can plug a single item graphed over time into things like rrdtool to
> get good trending information. And it's often easier to do this using
> sql interfaces to get the data than pulling it out of log files (almost
> like the db was designed for that :-)

The pg_stat_bgwriter value for buffers_checkpoint was intentionally
implemented in 8.3 such that it jumps in one big lump when the checkpoint
is done. While it's not the ideal interface for what you're looking for,
the reason for that is to made it possible to build a "when was the last
checkpoint finished?" interface via some remote monitoring tool just by
determining the last time that the value jumped upwards. You can easily
see them just by graphing that value, it shouldn't be too hard to teach
something with rrdtool guts to find them.

Since checkpoints have a fairly predictable duration in 8.3, as long as
you catch the start or end of them you can make a resonable guess where
the other side was. The case you're trying to avoid here, the system
going a long time without checkpointing, can be implemented by looking for
a begin or end regularly, you don't need to track both. As long as
there's a checkpoint finish "pulse" in buffers_checkpoint showing up
regularly you're fine. The only situation I can think of where this might
be problematic is where the system has been idle enough to not have any
buffers to write at checkpoint time, but I recall a code path there where
checkpoints stop altogether unless there's been activity so even tracking
the time may not change that.

Ultimately a lot of the other questions you might ask (i.e. "how many
buffers have been written per hour by checkpoints?") require processing
the numbers in this way anyway, and I thought this implementation was good
enough to monitor the situation you're trying to avoid--presuming you're
using some sort of moderately powerful remote monitoring tool. Theo's
patch would make it easier to answer with a simple command which has some
value; a little SQL in a cron job would be good enough to trigger an alert
rather than needing a real monitoring probe.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2008-04-04 04:20:33 Re: Expose checkpoint start/finish times into SQL.
Previous Message Tom Lane 2008-04-04 03:58:30 Re: Expose checkpoint start/finish times into SQL.