Re: PATCH: regular logging of checkpoint progress

Lists: pgsql-hackers
From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: PATCH: regular logging of checkpoint progress
Date: 2011-08-25 20:57:11
Message-ID: 10c944c3d316b1e791010313b1341c51.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I'd like to propose a small patch that allows better checkpoint progress
monitoring. The patch is quite simple - it adds a new integer GUC
"checkpoint_update_limit" and every time checkpoint writes this number of
buffers, it does two things:

(a) logs a "checkpoint status" message into the server log, with info
about total number of buffers to write, number of already written buffers,
current and average write speed and estimate of remaining time

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)

I believe this will make checkpoint tuning easier, especially with large
shared bufferers and large when there's other write activity (so that it's
difficult to see checkpoint I/O).

The default value (0) means this continuous logging is disabled.

Tomas

Attachment Content-Type Size
checkpoint-stats-update.diff text/plain 6.6 KB

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-25 22:18:15
Message-ID: 4E56CA27.1050202@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tomas,

> I'd like to propose a small patch that allows better checkpoint progress
> monitoring. The patch is quite simple - it adds a new integer GUC
> "checkpoint_update_limit" and every time checkpoint writes this number of
> buffers, it does two things:

I'd rather not have a new GUC if we can avoid it. What about just
making this some reasonable value (like 1000 buffers) if log_checkpoints
= on?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-25 22:39:14
Message-ID: 5932c410d45252aca18b91de3bf0ceed.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 Srpen 2011, 0:18, Josh Berkus wrote:
> Tomas,
>
>> I'd like to propose a small patch that allows better checkpoint progress
>> monitoring. The patch is quite simple - it adds a new integer GUC
>> "checkpoint_update_limit" and every time checkpoint writes this number
>> of
>> buffers, it does two things:
>
> I'd rather not have a new GUC if we can avoid it. What about just
> making this some reasonable value (like 1000 buffers) if log_checkpoints
> = on?

I was thinking about that too, but I think no value can fit all cases
because the systems may have very different I/O subsystems.

With one 7.2k drive I usually get about 25MB/s on average, with big arrays
/ good controllers / fast drives you can write much faster. So either the
value will be too low (and the log will be infested with those messages)
or too high (so it won't be updated very often).

And it depends on the number of shared buffers too. I thought it could be
something like 5% of shared buffers but it really does not solve the
issues. So I think there should be a new GUC.

OTOH, it's probably a good idea to log this only when the log_checkpoints
is enabled as those who tune checkpoints will enable log_checkpoints in
the first case.

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-25 23:49:15
Message-ID: eb41e89c2d6e687ebf9eb4c4c78057f2.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 Srpen 2011, 0:39, Tomas Vondra wrote:
> On 26 Srpen 2011, 0:18, Josh Berkus wrote:
>> Tomas,
>>
>>> I'd like to propose a small patch that allows better checkpoint
>>> progress
>>> monitoring. The patch is quite simple - it adds a new integer GUC
>>> "checkpoint_update_limit" and every time checkpoint writes this number
>>> of
>>> buffers, it does two things:
>>
>> I'd rather not have a new GUC if we can avoid it. What about just
>> making this some reasonable value (like 1000 buffers) if log_checkpoints
>> = on?
>
> I was thinking about that too, but I think no value can fit all cases
> because the systems may have very different I/O subsystems.
>
> With one 7.2k drive I usually get about 25MB/s on average, with big arrays
> / good controllers / fast drives you can write much faster. So either the
> value will be too low (and the log will be infested with those messages)
> or too high (so it won't be updated very often).

Hmmm, maybe we could use time instead of number of buffers? Something like
"every 5 seconds, log the checkpoint progress" instead of "every time 1000
buffers is written ..."

That should work on systems regardless of I/O performance.

But although using time instead of number of buffers seems like a good
idea, I don't think it eliminates the need for a new GUC.

(a) Even with time limit, I find it useful to be able to set the limits
differently.

(b) In some cases it may be useful to enable just basic (current
behaviour) checkpoint logging using log_checkpoints, so using it for this
new patch may not be a good idea. Although this could be fixed by allowing
three values no/basic/detailed instead of just true/false.

Tomas


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 07:35:05
Message-ID: 4E574CA9.8090809@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/25/2011 04:57 PM, Tomas Vondra wrote:
> (b) sends bgwriter stats (so that the buffers_checkpoint is updated)
>

The idea behind only updating the stats in one chunk, at the end, is
that it makes one specific thing easier to do. Let's say you're running
a monitoring system that is grabbing snapshots of pg_stat_bgwriter
periodically. If you want to figure out how much work a checkpoint did,
you only need two points of data to compute that right now. Whenever
you see either of the checkpoint count numbers increase, you just
subtract off the previous sample; now you've got a delta for how many
buffers that checkpoint wrote out. You can derive the information about
the buffer counts involved that appears in the logs quite easily this
way. The intent was to make that possible to do, so that people can
figure this out without needing to parse the log data.

Spreading out the updates defeats that idea. It also makes it possible
to see the buffer writes more in real-time, as they happen. You can
make a case for both approaches having their use cases; the above is
just summarizing the logic behind why it's done the way it is right
now. I don't think many people are actually doing things with this to
the level where their tool will care. The most popular consumer of
pg_stat_bgwriter data I see is Munin graphing changes, and I don't think
it will care either way.

Giving people the option of doing it the other way is a reasonable idea,
but I'm not sure there's enough use case there to justify adding a GUC
just for that. My next goal here is to eliminate checkpoint_segments,
not to add yet another tunable extremely few users would ever touch.

As for throwing more log data out, I'm not sure what new analysis you're
thinking of that it allows. I/O gets increasingly spiky as you zoom in
on it; averaging over a shorter period can easily end up providing less
insight about trends. If anything, I spend more time summarizing the
data that's already there, rather than wanting to break them down. It's
already providing way too much detail for most people. Customers tell
me they don't care to see checkpoint stats unless they're across a day
or more of sampling, so even the current "once every ~5 minutes" is way
more info than they want. I have all this log parsing code and things
that look at pg_stat_bgwriter to collect that data and produce higher
level reports. And lots of it would break if any of this patch is added
and people turn it on. I imagine other log/stat parsing programs might
suffer issues too. That's your other hurdle for change here: the new
analysis techniques have to be useful enough to justify that some
downstream tool disruption is inevitable.

If you have an idea for how to use this extra data for something useful,
let's talk about what that is and see if it's possible to build it in
instead. This problem is harder than it looks, mainly because the way
the OS caches writes here makes trying to derive hard numbers from what
the background writer is doing impossible. When the database writes
things out, and when they actually get written to disk, they are not the
same event. The actual write is often during the sync phase, and not
being able to tracking that beast is where I see the most problems at.
The write phase, the easier part to instrument in the database, that is
pretty boring. That's why the last extra logging I added here focused
on adding visibility to the sync activity instead.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 07:54:19
Message-ID: CABUevEyhKJcyTD_Dpg0nx4DBytVJZNbCeVyyKEP_Ac9f9YBthA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 25, 2011 at 22:57, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> Hello,
>
> I'd like to propose a small patch that allows better checkpoint progress
> monitoring. The patch is quite simple - it adds a new integer GUC
> "checkpoint_update_limit" and every time checkpoint writes this number of
> buffers, it does two things:
>
> (a) logs a "checkpoint status" message into the server log, with info
> about total number of buffers to write, number of already written buffers,
> current and average write speed and estimate of remaining time
>
> (b) sends bgwriter stats (so that the buffers_checkpoint is updated)
>
> I believe this will make checkpoint tuning easier, especially with large
> shared bufferers and large when there's other write activity (so that it's
> difficult to see checkpoint I/O).
>
> The default value (0) means this continuous logging is disabled.

This seems like the wrong thing to write to the log. It's really only
useful *during* the checkpoint run, isn't it? If so, I think it should
go in a pg_stat view. In theory, this could be the "progress view" or
"progress field" talked about around Gregs previous patch - or it
could just be modifying the commandstring in pg_stat_activity. Either
way, it should be updated in shared memory in that case (like current
query is), and not sent with a message to the collector.

IMHO, of course ;)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Greg Smith" <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 08:46:33
Message-ID: 655fb4e00553434672d2a7be4bdcf78d.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 Srpen 2011, 9:35, Greg Smith wrote:
> On 08/25/2011 04:57 PM, Tomas Vondra wrote:
>> (b) sends bgwriter stats (so that the buffers_checkpoint is updated)
>>
> As for throwing more log data out, I'm not sure what new analysis you're
> thinking of that it allows. I/O gets increasingly spiky as you zoom in
> on it; averaging over a shorter period can easily end up providing less
> insight about trends. If anything, I spend more time summarizing the
> data that's already there, rather than wanting to break them down. It's
> already providing way too much detail for most people. Customers tell
> me they don't care to see checkpoint stats unless they're across a day
> or more of sampling, so even the current "once every ~5 minutes" is way
> more info than they want. I have all this log parsing code and things
> that look at pg_stat_bgwriter to collect that data and produce higher
> level reports. And lots of it would break if any of this patch is added
> and people turn it on. I imagine other log/stat parsing programs might
> suffer issues too. That's your other hurdle for change here: the new
> analysis techniques have to be useful enough to justify that some
> downstream tool disruption is inevitable.

I was aware that by continuously updating pg_stat_bgwriter, the data won't
be synchronized (i.e. the buffers_checkpoint counters will change while
the number of requested/timed checkpoints remain the same).

But does that really break the tools that process the data? When you're
working with summarized data, the result should be more or less the same
as the difference will be smoothed out by averaging etc. You can always
see just one "in progress" checkpoint, so if you get 24 checkpoints a day,
the difference will be 1/24 of a checkpoint. Yes, it's a difference.

A really crazy workaround would be to change checkpoints_requested /
checkpoints_timed to double, and use that to indicate current progress of
the checkpoint. So for example 10.54 would mean 10 checkpoints completed,
one checkpoint in progress, already written 54% of blocks. But yes, that's
a bit crazy.

> If you have an idea for how to use this extra data for something useful,
> let's talk about what that is and see if it's possible to build it in
> instead. This problem is harder than it looks, mainly because the way
> the OS caches writes here makes trying to derive hard numbers from what
> the background writer is doing impossible. When the database writes
> things out, and when they actually get written to disk, they are not the
> same event. The actual write is often during the sync phase, and not
> being able to tracking that beast is where I see the most problems at.
> The write phase, the easier part to instrument in the database, that is
> pretty boring. That's why the last extra logging I added here focused
> on adding visibility to the sync activity instead.

Hmmm, let me explain what led me to this patch - right now I'm doing a
comparison of filesystems with various block sizes (both fs and db
blocks). I've realized that the db block size significantly influences
frequency of checkpoints and amount of data to write, so I'm collecting
data from pg_stat_bgwriter too. The benchmark goes like this

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

I can't infer this from other data (e.g. iostat), because that does allow
me what I/O is caused by the checkpoint.

Yes, this does not consider sync timing, but in my case that's not a big
issue (the page cache is rather small so the data are actually forced to
the drive soon).

I could probably live with keeping the current pg_stat_bgwriter logic
(i.e. updating just once) and writing checkpoint status just to the log. I
don't think that should break any current tools that parse logs, because
the message is completely different (prefixed with 'checkpoint status') so
any reasonably written tool should be OK.

Tomas


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 17:17:05
Message-ID: 4E57D511.5050408@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/26/2011 03:54 AM, Magnus Hagander wrote:
> In theory, this could be the "progress view" or
> "progress field" talked about around Gregs previous patch - or it
> could just be modifying the commandstring in pg_stat_activity.

Right. The whole progress indicator idea is hard to do for queries in
general. But there's enough of these other progress indicator ideas
around now that it may be worth putting a standard way to handle them in
here. It sounds like that would be sufficient to address the area Tomas
is trying to instrument better. I badly want a progress indicator on
CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the
build process it's on. That's turned into a major headache recently.

If we run with the idea of just allowing backends to publish a progress
text string, I think this one maps into a similar space as the
autovacuum one. Publishing how many seconds the operation has been
running for may be reasonable too. Whether the overhead of the timing
calls necessary to compute that will be high or not depends on the
refresh rate of the progress info. My suggestion before was to name
these as key=value pairs for easy parsing; here's three examples now:

autovacumm: pgbench_accounts h=182701 m=301515 d=321345 s=62.231
(cache hits, cache misses, dirty writes, seconds)

background writer: checkpoint b=511 t=3072 s=5.321
(buffers written, total, seconds)

create index concurrently: pgbench_accounts p=1 b=62 t=6213 s=81.232
(phase, blocks processed, total block estimate, seconds)

I think that the idea of making this easily human readable is
optimistic, because it will make all these strings big enough to start
mattering. Given that, we almost have to assume the only consumers of
this data will be able to interpret it using the documentation. I'd be
happy with just the minimal data set in each case, not including any
statistics you can easily derive from the values given (like the MB/s
readings). Adding that figure in particular to more of the log messages
would be nice though.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 18:01:39
Message-ID: 8c5bc9b4fa99664a9f3dab618b85309b.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 Srpen 2011, 9:54, Magnus Hagander wrote:
> This seems like the wrong thing to write to the log. It's really only
> useful *during* the checkpoint run, isn't it? If so, I think it should
> go in a pg_stat view. In theory, this could be the "progress view" or
> "progress field" talked about around Gregs previous patch - or it
> could just be modifying the commandstring in pg_stat_activity. Either
> way, it should be updated in shared memory in that case (like current
> query is), and not sent with a message to the collector.

I personally find it handy for example when I need to find out why
performance degraded at a certain point in the past. Now I can see there
was a checkpoint (thanks to log_checkpoints=on), but I don't know any
details about it's progress.

I already collect info from the pg_stat_bgwriter, that's why I thought I
could update it more often. The log file is a natural destination for such
information, IMHO. I see that as an extension to the current checkpoint
messages that are written to the log.

The commandstring approach is probably fine for interactive work, but not
very handy when you need to analyze something that already happened.
Unless you collected the data, of course.

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Greg Smith" <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 18:07:10
Message-ID: 0e358268f8ae91396b617382377e989d.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 Srpen 2011, 19:17, Greg Smith wrote:
> On 08/26/2011 03:54 AM, Magnus Hagander wrote:
>> In theory, this could be the "progress view" or
>> "progress field" talked about around Gregs previous patch - or it
>> could just be modifying the commandstring in pg_stat_activity.
>
> Right. The whole progress indicator idea is hard to do for queries in
> general. But there's enough of these other progress indicator ideas
> around now that it may be worth putting a standard way to handle them in
> here. It sounds like that would be sufficient to address the area Tomas
> is trying to instrument better. I badly want a progress indicator on
> CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the
> build process it's on. That's turned into a major headache recently.
>
> If we run with the idea of just allowing backends to publish a progress
> text string, I think this one maps into a similar space as the
> autovacuum one. Publishing how many seconds the operation has been
> running for may be reasonable too. Whether the overhead of the timing
> calls necessary to compute that will be high or not depends on the
> refresh rate of the progress info. My suggestion before was to name
> these as key=value pairs for easy parsing; here's three examples now:
>
> autovacumm: pgbench_accounts h=182701 m=301515 d=321345 s=62.231
> (cache hits, cache misses, dirty writes, seconds)
>
> background writer: checkpoint b=511 t=3072 s=5.321
> (buffers written, total, seconds)
>
> create index concurrently: pgbench_accounts p=1 b=62 t=6213 s=81.232
> (phase, blocks processed, total block estimate, seconds)
>
> I think that the idea of making this easily human readable is
> optimistic, because it will make all these strings big enough to start
> mattering. Given that, we almost have to assume the only consumers of
> this data will be able to interpret it using the documentation. I'd be
> happy with just the minimal data set in each case, not including any
> statistics you can easily derive from the values given (like the MB/s
> readings). Adding that figure in particular to more of the log messages
> would be nice though.

I'm a bit confused - are you talking about updating process title or about
writing the info to log? The process title is probably fine for watching
the progress interactively, but it really does not solve what I need.

I need to be able to infer the progress for past events, so I'd have to
sample the 'ps ax' output regularly.

Tomas


From: Noah Misch <noah(at)2ndQuadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-27 04:01:39
Message-ID: 20110827040138.GC29429@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
> Hmmm, let me explain what led me to this patch - right now I'm doing a
> comparison of filesystems with various block sizes (both fs and db
> blocks). I've realized that the db block size significantly influences
> frequency of checkpoints and amount of data to write, so I'm collecting
> data from pg_stat_bgwriter too. The benchmark goes like this
>
> 1. collect pg_stat_bgwriter stats
> 2. run pgbench for 10 minutes
> 3. collect pg_stat_bgwriter stats (to compute difference with (1))
> 4. kill the postmaster
>
> The problem is that when checkpoint stats are collected, there might be a
> checkpoint in progress and in that case the stats are incomplete. In some
> cases (especially with very small db blocks) this has significant impact
> because the checkpoints are less frequent.

Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

--
Noah Misch http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)2ndQuadrant(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-27 07:39:14
Message-ID: 4E589F22.1020408@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/27/2011 12:01 AM, Noah Misch wrote:
> On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
>
>> 1. collect pg_stat_bgwriter stats
>> 2. run pgbench for 10 minutes
>> 3. collect pg_stat_bgwriter stats (to compute difference with (1))
>> 4. kill the postmaster
>>
>> The problem is that when checkpoint stats are collected, there might be a
>> checkpoint in progress and in that case the stats are incomplete. In some
>> cases (especially with very small db blocks) this has significant impact
>> because the checkpoints are less frequent.
>>
> Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?
>

That's what I do in pgbench-tools, and it helps a lot. It makes it
easier to identify when the checkpoint kicks in if you know it's
approximately the same time after each test run begins, given similar
testing parameters. That said, it's hard to eliminate all of the edge
conditions here.

For example, imagine that you're consuming WAL files such that you hit
checkpoint_segments every 4 minutes. In a 10 minute test run, a
checkpoint will start at 4:00 and finish at around 6:00 (with
checkpoint_completion_target=0.5). The next will start at 8:00 and
should finish at around 10:00--right at the end of when the test ends.
Given the variation that sync timing and rounding issues in the write
phase adds to things, you can expect that some test runs will include
stats from 2 checkpoints, while others will end the test just before the
second one finishes. It does throw the numbers off a bit.

To avoid this when it pops up, I normally aim to push up to where there
are >=4 checkpoints per test run, just so whether I get n or n-1 of them
doesn't impact results as much. But that normally takes doubling the
length of the test to 20 minutes. As it will often take me days of test
time to plow through exploring just a couple of parameters, I'm
sympathetic to Tomas trying to improve accuracy here without having to
run for quite so long. There's few people who have this problem to
worry about though, it's a common issue with benchmarking but not many
other contexts.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Noah Misch" <noah(at)2ndQuadrant(dot)com>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Greg Smith" <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-27 13:57:16
Message-ID: 240e9c7a3e0deb664908c4bd07a4a01e.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 Srpen 2011, 6:01, Noah Misch wrote:
> On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
>> 1. collect pg_stat_bgwriter stats
>> 2. run pgbench for 10 minutes
>> 3. collect pg_stat_bgwriter stats (to compute difference with (1))
>> 4. kill the postmaster
>>
>> The problem is that when checkpoint stats are collected, there might be
>> a
>> checkpoint in progress and in that case the stats are incomplete. In
>> some
>> cases (especially with very small db blocks) this has significant impact
>> because the checkpoints are less frequent.
>
> Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?

I already do that, but it really does not solve the issue. It just aligns
the first expected 'timed' checkpoint, it does not solve the problem with
in-progress checkpoints unless the runs behave exactly the same (and
that's the boring case).

Tomas


From: Noah Misch <noah(at)2ndQuadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-29 05:47:56
Message-ID: 20110829054755.GA9036@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Aug 27, 2011 at 03:57:16PM +0200, Tomas Vondra wrote:
> On 27 Srpen 2011, 6:01, Noah Misch wrote:
> > On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
> >> 1. collect pg_stat_bgwriter stats
> >> 2. run pgbench for 10 minutes
> >> 3. collect pg_stat_bgwriter stats (to compute difference with (1))
> >> 4. kill the postmaster
> >>
> >> The problem is that when checkpoint stats are collected, there might be
> >> a
> >> checkpoint in progress and in that case the stats are incomplete. In
> >> some
> >> cases (especially with very small db blocks) this has significant impact
> >> because the checkpoints are less frequent.
> >
> > Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?
>
> I already do that, but it really does not solve the issue. It just aligns
> the first expected 'timed' checkpoint, it does not solve the problem with
> in-progress checkpoints unless the runs behave exactly the same (and
> that's the boring case).

To clarify, run that command _after_ the 10-minute pgbench run. It
blocks until completion of both the in-progress checkpoint, if any,
and the requested checkpoint.

--
Noah Misch http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Noah Misch" <noah(at)2ndQuadrant(dot)com>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Greg Smith" <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-29 11:44:12
Message-ID: dd9a08d0c9bf37f6c2b22b84dcb087cd.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 Srpen 2011, 7:47, Noah Misch wrote:
> On Sat, Aug 27, 2011 at 03:57:16PM +0200, Tomas Vondra wrote:
>> On 27 Srpen 2011, 6:01, Noah Misch wrote:
>> > Could you remove this hazard by adding a step "2a. psql -c
>> CHECKPOINT"?
>>
>> I already do that, but it really does not solve the issue. It just
>> aligns
>> the first expected 'timed' checkpoint, it does not solve the problem
>> with
>> in-progress checkpoints unless the runs behave exactly the same (and
>> that's the boring case).
>
> To clarify, run that command _after_ the 10-minute pgbench run. It
> blocks until completion of both the in-progress checkpoint, if any,
> and the requested checkpoint.

I don't think this is going to help when there is a checkpoint in
progress, as the new CHECKPOINT will wait till the already running one
completes. It might force it to finish faster, but I still can't say how
much data was written when the pgbench was running.

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-01 19:59:56
Message-ID: 14933e7d24533b829c198e0d8c7b5ad3.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've prepared a significantly simplified version of the patch. The two
main changes are

(a) it does not update the pg_stat_bgwriter anymore, it just prints an
info to the server log

(b) a new GUC is not required, it's driven by the log_checkpoints

This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
30%, ...) and whenever 5 seconds since the last log elapses. The time is
not checked for each buffer but for 128 buffers.

So if the checkpoint is very slow, you'll get a message every 5 seconds,
if it's fast you'll get 10 messages.

Tomas

Attachment Content-Type Size
checkpoint-stats-update-v2.diff text/plain 3.4 KB

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 07:47:26
Message-ID: CABUevEzHRNo2=BM=fP5=gvC-qN3ZXaf6c+LQ-4gAUWsorNTRAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 1, 2011 at 21:59, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> I've prepared a significantly simplified version of the patch. The two
> main changes are
>
> (a) it does not update the pg_stat_bgwriter anymore, it just prints an
> info to the server log
> (b) a new GUC is not required, it's driven by the log_checkpoints

The comment still refers to the checkpoint_update_limit.

> This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
> 30%, ...) and whenever 5 seconds since the last log elapses. The time is
> not checked for each buffer but for 128 buffers.
>
> So if the checkpoint is very slow, you'll get a message every 5 seconds,
> if it's fast you'll get 10 messages.

I would personally find this very annoying. If I read it correctly,
anybody with a database with no problem at all but that has
log_checkpoints on suddenly got at least 10 times as many messages? I
generally try to advise my clients to *not* log excessively because
then they will end up not bothering to read the logs...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 09:01:00
Message-ID: 477bc1196f185fab7ee5f91b22a24d53.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 9:47, Magnus Hagander wrote:
> On Thu, Sep 1, 2011 at 21:59, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> I've prepared a significantly simplified version of the patch. The two
>> main changes are
>>
>> (a) it does not update the pg_stat_bgwriter anymore, it just prints an
>> info to the server log
>> (b) a new GUC is not required, it's driven by the log_checkpoints
>
> The comment still refers to the checkpoint_update_limit.

OK, I'll fix that.

>> This version will log at least 10 'checkpoint status' lines (at 10%,
>> 20%,
>> 30%, ...) and whenever 5 seconds since the last log elapses. The time is
>> not checked for each buffer but for 128 buffers.
>>
>> So if the checkpoint is very slow, you'll get a message every 5 seconds,
>> if it's fast you'll get 10 messages.
>
> I would personally find this very annoying. If I read it correctly,
> anybody with a database with no problem at all but that has
> log_checkpoints on suddenly got at least 10 times as many messages? I
> generally try to advise my clients to *not* log excessively because
> then they will end up not bothering to read the logs...

What about logging it with a lower level, e.g. NOTICE instead of the
current LOG? If that's not a solution then a new GUC is needed I guess.

Tomas


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 10:45:34
Message-ID: CABUevEzg46c2d1br+pv_+zQKFFC76sr1LKbQz0xCy2HD9xx0MQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 2, 2011 at 11:01, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 2 Září 2011, 9:47, Magnus Hagander wrote:
>> On Thu, Sep 1, 2011 at 21:59, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>>> I've prepared a significantly simplified version of the patch. The two
>>> main changes are
>>>
>>> (a) it does not update the pg_stat_bgwriter anymore, it just prints an
>>> info to the server log
>>> (b) a new GUC is not required, it's driven by the log_checkpoints
>>
>> The comment still refers to the checkpoint_update_limit.
>
> OK, I'll fix that.
>
>>> This version will log at least 10 'checkpoint status' lines (at 10%,
>>> 20%,
>>> 30%, ...) and whenever 5 seconds since the last log elapses. The time is
>>> not checked for each buffer but for 128 buffers.
>>>
>>> So if the checkpoint is very slow, you'll get a message every 5 seconds,
>>> if it's fast you'll get 10 messages.
>>
>> I would personally find this very annoying. If I read it correctly,
>> anybody with a database with no problem at all but that has
>> log_checkpoints on suddenly got at least 10 times as many messages? I
>> generally try to advise my clients to *not* log excessively because
>> then they will end up not bothering to read the logs...
>
> What about logging it with a lower level, e.g. NOTICE instead of the
> current LOG? If that's not a solution then a new GUC is needed I guess.

I guess if it's at a DEBUG level it won't annoy anybody who doesn't
need it. Not sure if NOTICE is low enough..

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 13:25:09
Message-ID: 801b970ec2b13e81fbd073dfc4252464.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 12:45, Magnus Hagander wrote:
> On Fri, Sep 2, 2011 at 11:01, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> What about logging it with a lower level, e.g. NOTICE instead of the
>> current LOG? If that's not a solution then a new GUC is needed I guess.
>
> I guess if it's at a DEBUG level it won't annoy anybody who doesn't
> need it. Not sure if NOTICE is low enough..

I've changed the level to NOTICE. I guess I could put that to info, but
the debug levels seem too low to me. Is there a rule of a thumb about
where to put messages?

Tomas

Attachment Content-Type Size
checkpoint-stats-update-v3.diff text/plain 3.4 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 13:44:42
Message-ID: 1314971082.29051.0.camel@fsopti579.F-Secure.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
> What about logging it with a lower level, e.g. NOTICE instead of the
> current LOG? If that's not a solution then a new GUC is needed I
> guess.

Changing the log level is not the appropriate solution. Make it a
configuration parameter.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 14:13:19
Message-ID: 15092.1314972799@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
>> What about logging it with a lower level, e.g. NOTICE instead of the
>> current LOG? If that's not a solution then a new GUC is needed I
>> guess.

> Changing the log level is not the appropriate solution. Make it a
> configuration parameter.

Frankly, logging as verbose as this is proposed to be is something
that I can't imagine anybody wanting at all, especially not in
production environments. DEBUG3 or somewhere around there seems fine.

regards, tom lane


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:02:04
Message-ID: 43658c684e9388daf2f8481899571e54.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 15:44, Peter Eisentraut wrote:
> On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
>> What about logging it with a lower level, e.g. NOTICE instead of the
>> current LOG? If that's not a solution then a new GUC is needed I
>> guess.
>
> Changing the log level is not the appropriate solution. Make it a
> configuration parameter.

Why is it inappropriate solution? There's a log_checkpoints GUC that
drives it and you can either get basic info (summary of the checkpoint) or
detailed log (with a lower log level).

In the first patch I've proposed a new GUC (used to set how often the info
should be logged or disable it), but Josh Berkus pointed out that I should
get rid of it if I can. Which is what I've done in the following patches.

Tomas


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:08:45
Message-ID: CABUevEzNZmg6VHerp1=jDP5OY=XAtWGthHvvNKeEB49H8EaAkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sep 2, 2011 5:02 PM, "Tomas Vondra" <tv(at)fuzzy(dot)cz> wrote:
>
> On 2 Září 2011, 15:44, Peter Eisentraut wrote:
> > On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
> >> What about logging it with a lower level, e.g. NOTICE instead of the
> >> current LOG? If that's not a solution then a new GUC is needed I
> >> guess.
> >
> > Changing the log level is not the appropriate solution. Make it a
> > configuration parameter.
>
> Why is it inappropriate solution? There's a log_checkpoints GUC that
> drives it and you can either get basic info (summary of the checkpoint) or
> detailed log (with a lower log level).
>
> In the first patch I've proposed a new GUC (used to set how often the info
> should be logged or disable it), but Josh Berkus pointed out that I should
> get rid of it if I can. Which is what I've done in the following patches.

Well, josh doesn't speak for everybody ;-)

Maybe one way could be to change log_checkpoints into an enum of "off, on,
debug "(values open for bikeshedding of course)

/Magnus


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:10:12
Message-ID: 93225ed61ca62fa906649233eb88ce75.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 16:13, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
>>> What about logging it with a lower level, e.g. NOTICE instead of the
>>> current LOG? If that's not a solution then a new GUC is needed I
>>> guess.
>
>> Changing the log level is not the appropriate solution. Make it a
>> configuration parameter.
>
> Frankly, logging as verbose as this is proposed to be is something
> that I can't imagine anybody wanting at all, especially not in
> production environments. DEBUG3 or somewhere around there seems fine.

Yes, it can be a bit verbose, but I don't think it's that verbose. I'm one
of those this information about checkpoint progress may be a valuable info
- even on production systems - that's why I don't want to put that into
the debug levels.

It should write about 10 lines for each checkpoint (altough I've just
realized the current patch won't do that for the timed checkpoints - it'll
probably log significantly more).

My 'ideal' solution would be either to add another GUC (to turn this
on/off) or allow log_checkpoints to have three values

log_checkpoints = {off, normal, detailed}

where 'normal' provides the current output and 'detail' produces this much
verbose output.

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org, "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:13:03
Message-ID: 0159f3b80e2a5a78901c15645dd32a63.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 17:08, Magnus Hagander wrote:
> On Sep 2, 2011 5:02 PM, "Tomas Vondra" <tv(at)fuzzy(dot)cz> wrote:
>>
>> On 2 Září 2011, 15:44, Peter Eisentraut wrote:
>> > On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
>> >> What about logging it with a lower level, e.g. NOTICE instead of the
>> >> current LOG? If that's not a solution then a new GUC is needed I
>> >> guess.
>> >
>> > Changing the log level is not the appropriate solution. Make it a
>> > configuration parameter.
>>
>> Why is it inappropriate solution? There's a log_checkpoints GUC that
>> drives it and you can either get basic info (summary of the checkpoint)
>> or
>> detailed log (with a lower log level).
>>
>> In the first patch I've proposed a new GUC (used to set how often the
>> info
>> should be logged or disable it), but Josh Berkus pointed out that I
>> should
>> get rid of it if I can. Which is what I've done in the following
>> patches.
>
> Well, josh doesn't speak for everybody ;-)

Sure, but I think the effort not to have a zillion of GUC makes sense.

> Maybe one way could be to change log_checkpoints into an enum of "off, on,
> debug "(values open for bikeshedding of course)

Yes, that's actually one of the solutions I'd prefer. Not sure why I
rejected it ...

Tomas


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tomas Vondra" <tv(at)fuzzy(dot)cz>,"Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:14:36
Message-ID: 4E60AC8C0200002500040B2D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

."Tomas Vondra" <tv(at)fuzzy(dot)cz> wrote:
> On 2 Zá*í 2011, 15:44, Peter Eisentraut wrote:

>> Changing the log level is not the appropriate solution. Make it
>> a configuration parameter.
>
> Why is it inappropriate solution? There's a log_checkpoints GUC
> that drives it and you can either get basic info (summary of the
> checkpoint) or detailed log (with a lower log level).

Maybe?:

log_checkpoints = 'verbose'

It seems to me that one of the most likely uses of this is for
investigating performance issues in production. Like with
log_statement, you might want to turn it on briefly and examine
results.

-Kevin


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:18:31
Message-ID: CAJKUy5jN+epCPCxZ5p+0giN4ux81Nm6AG3UW94pth04FapUYQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Sep 2, 2011 at 10:10 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>
> My 'ideal' solution would be either to add another GUC (to turn this
> on/off) or allow log_checkpoints to have three values
>
> log_checkpoints = {off, normal, detailed}
>

+1 on make it an enum but i prefer values {on, off, verbose}

where "on" will log just the same as now

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 15:23:13
Message-ID: CABUevEzAe1kw3RQjDyMR_kpODt5=YV+kmAis3LJxcyM9cRA_QA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sep 2, 2011 5:18 PM, "Jaime Casanova" <jaime(at)2ndquadrant(dot)com> wrote:
>
> On Fri, Sep 2, 2011 at 10:10 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> >
> > My 'ideal' solution would be either to add another GUC (to turn this
> > on/off) or allow log_checkpoints to have three values
> >
> > log_checkpoints = {off, normal, detailed}
> >
>
> +1 on make it an enum but i prefer values {on, off, verbose}
>
> where "on" will log just the same as now

Yeah, keeping on and off working the same way as they do today is a feature
on itself.

/Magnus


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 17:09:42
Message-ID: CA+TgmoY4jJzyioxYR8t9TzEwrk3B76rgZmWeqFj+tNxoMjcv3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 1, 2011 at 3:59 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> I've prepared a significantly simplified version of the patch. The two
> main changes are
>
> (a) it does not update the pg_stat_bgwriter anymore, it just prints an
> info to the server log
>
> (b) a new GUC is not required, it's driven by the log_checkpoints
>
> This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
> 30%, ...) and whenever 5 seconds since the last log elapses. The time is
> not checked for each buffer but for 128 buffers.
>
> So if the checkpoint is very slow, you'll get a message every 5 seconds,
> if it's fast you'll get 10 messages.

This seems like a strange heuristic. I understand the value of
emitting a progress report every N seconds, but why would you want a
report at 10%, 20%, 30% even if it hasn't been 5 seconds yet? I don't
need ten progress messages on a checkpoint that only takes three
seconds to complete.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 18:43:16
Message-ID: 1314988997.11695.1.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On fre, 2011-09-02 at 17:02 +0200, Tomas Vondra wrote:
> On 2 Září 2011, 15:44, Peter Eisentraut wrote:
> > On fre, 2011-09-02 at 11:01 +0200, Tomas Vondra wrote:
> >> What about logging it with a lower level, e.g. NOTICE instead of the
> >> current LOG? If that's not a solution then a new GUC is needed I
> >> guess.
> >
> > Changing the log level is not the appropriate solution. Make it a
> > configuration parameter.
>
> Why is it inappropriate solution? There's a log_checkpoints GUC that
> drives it and you can either get basic info (summary of the checkpoint) or
> detailed log (with a lower log level).

If a user is forced to change the log level to get at one particular
piece of information, they will then also turn on countless other log
events on that level, which is annoying.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 18:46:27
Message-ID: 1314989188.11695.4.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On fre, 2011-09-02 at 17:13 +0200, Tomas Vondra wrote:
> Sure, but I think the effort not to have a zillion of GUC makes sense.

Well, I'll be the first to agree that reducing complexity in
configuration and tuning settings is worth aiming for.

But for logging, I'd rather have more settings, theoretically up to one
for each possible message. That doesn't increase complexity, as long as
it has linear behavior. It's debatable whether that means a new
log_something parameter for each situation, or just a single parameter
containing some kind of list, or something else, but that's a different
problem.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 18:48:24
Message-ID: 22522.1314989304@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On fre, 2011-09-02 at 17:02 +0200, Tomas Vondra wrote:
>> Why is it inappropriate solution? There's a log_checkpoints GUC that
>> drives it and you can either get basic info (summary of the checkpoint) or
>> detailed log (with a lower log level).

> If a user is forced to change the log level to get at one particular
> piece of information, they will then also turn on countless other log
> events on that level, which is annoying.

Yeah, if we're going to have this at all, some form of GUC control over
it seems necessary. I'm still disturbed by the verbosity of the
proposed output though. Couldn't we collapse the information into a
single log entry per checkpoint cycle? Perhaps that would be sufficient
to just let the log_checkpoints setting be used as-is.

regards, tom lane


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 18:54:05
Message-ID: 5f80ebfc28225c4ae51e1577bdb6264d.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 19:09, Robert Haas wrote:
> On Thu, Sep 1, 2011 at 3:59 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> I've prepared a significantly simplified version of the patch. The two
>> main changes are
>>
>> (a) it does not update the pg_stat_bgwriter anymore, it just prints an
>> info to the server log
>>
>> (b) a new GUC is not required, it's driven by the log_checkpoints
>>
>> This version will log at least 10 'checkpoint status' lines (at 10%,
>> 20%,
>> 30%, ...) and whenever 5 seconds since the last log elapses. The time is
>> not checked for each buffer but for 128 buffers.
>>
>> So if the checkpoint is very slow, you'll get a message every 5 seconds,
>> if it's fast you'll get 10 messages.
>
> This seems like a strange heuristic. I understand the value of
> emitting a progress report every N seconds, but why would you want a
> report at 10%, 20%, 30% even if it hasn't been 5 seconds yet? I don't
> need ten progress messages on a checkpoint that only takes three
> seconds to complete.

Not really, but I admit that the current heuristics is a bit weird.

I do need to get info about progress even for timed checkpoints (that's
why I'm logging after each 10%). OTOH I've just realized that with the
current implementation it'll log each 5 seconds, so for a 5-minute timed
checkpoint you'll get about 60 messages. That probably is too much,
especially if you're using completion target near 1.0 (which means there's
always a timed checkpoint running).

So I think a sane heuristics would be:

(1) for a timed checpoint, log each 10%
(2) for other checkpoint types, log after 5 seconds

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 19:00:55
Message-ID: 741b550a7543d7fc377b1981356251d1.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 20:46, Peter Eisentraut wrote:
> On fre, 2011-09-02 at 17:13 +0200, Tomas Vondra wrote:
>> Sure, but I think the effort not to have a zillion of GUC makes sense.
>
> Well, I'll be the first to agree that reducing complexity in
> configuration and tuning settings is worth aiming for.
>
> But for logging, I'd rather have more settings, theoretically up to one
> for each possible message. That doesn't increase complexity, as long as
> it has linear behavior. It's debatable whether that means a new
> log_something parameter for each situation, or just a single parameter
> containing some kind of list, or something else, but that's a different
> problem.

OK, good point. The effort to make this work without a GUC obviously is a
dead end, and the idea to turn log_checkpoints into an enum seems
reasonable.

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 19:09:52
Message-ID: 4ec7384cc913972e398e8cbe9c77c5b3.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 20:48, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> On fre, 2011-09-02 at 17:02 +0200, Tomas Vondra wrote:
>>> Why is it inappropriate solution? There's a log_checkpoints GUC that
>>> drives it and you can either get basic info (summary of the checkpoint)
>>> or
>>> detailed log (with a lower log level).
>
>> If a user is forced to change the log level to get at one particular
>> piece of information, they will then also turn on countless other log
>> events on that level, which is annoying.
>
> Yeah, if we're going to have this at all, some form of GUC control over
> it seems necessary. I'm still disturbed by the verbosity of the
> proposed output though. Couldn't we collapse the information into a
> single log entry per checkpoint cycle? Perhaps that would be sufficient
> to just let the log_checkpoints setting be used as-is.

Yes, the GUC seems inevitable. I'm still working on the verbosity for
different checpoint typees, but I think 5 seconds for xlog and 10% for
timed checkpoint is a reasonable setting.

I'm not sure what you mean by collapsing the info into a single log entry?
That would mean I'd have to wait till the checkpoint completes, and one of
the reasons for this patch was to get info about progress while the
checkpoint is running.

For example I'd like to have this information in cases when the checkpoint
never finishes - for example when performing automated benchmarks. I
sometimes just kill the database (I need to rebuild it from scratch for
the next run) so I don't get any checkpoint message at all. Waiting for
the checkpoint would significantly increase the time for each run and thus
for the whole benchmark. With one run that probably is not a problem, with
360 runs each minute makes a big difference.

Tomas


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 19:23:07
Message-ID: 23245.1314991387@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tomas Vondra" <tv(at)fuzzy(dot)cz> writes:
> On 2 Z 2011, 20:48, Tom Lane wrote:
>> Yeah, if we're going to have this at all, some form of GUC control over
>> it seems necessary. I'm still disturbed by the verbosity of the
>> proposed output though. Couldn't we collapse the information into a
>> single log entry per checkpoint cycle? Perhaps that would be sufficient
>> to just let the log_checkpoints setting be used as-is.

> I'm not sure what you mean by collapsing the info into a single log entry?
> That would mean I'd have to wait till the checkpoint completes, and one of
> the reasons for this patch was to get info about progress while the
> checkpoint is running.

Well, to be blunt, putting stuff into the postmaster log is entirely the
wrong way to satify a requirement like that. If you want to expose
progress information, it should be exposed via something dynamic like
pg_stat_activity. What could be useful to log is statistics that people
might want to aggregate later, and I don't immediately see a reason why
such stats couldn't be logged just once at end of each checkpoint cycle.

regards, tom lane


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 19:52:37
Message-ID: 4113ec48f1ad7aaf8d89864f17f30384.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 21:23, Tom Lane wrote:
> "Tomas Vondra" <tv(at)fuzzy(dot)cz> writes:
>> On 2 Z?????? 2011, 20:48, Tom Lane wrote:
>>> Yeah, if we're going to have this at all, some form of GUC control over
>>> it seems necessary. I'm still disturbed by the verbosity of the
>>> proposed output though. Couldn't we collapse the information into a
>>> single log entry per checkpoint cycle? Perhaps that would be
>>> sufficient
>>> to just let the log_checkpoints setting be used as-is.
>
>> I'm not sure what you mean by collapsing the info into a single log
>> entry?
>> That would mean I'd have to wait till the checkpoint completes, and one
>> of
>> the reasons for this patch was to get info about progress while the
>> checkpoint is running.
>
> Well, to be blunt, putting stuff into the postmaster log is entirely the
> wrong way to satify a requirement like that. If you want to expose
> progress information, it should be exposed via something dynamic like
> pg_stat_activity. What could be useful to log is statistics that people
> might want to aggregate later, and I don't immediately see a reason why
> such stats couldn't be logged just once at end of each checkpoint cycle.

The problem with pg_stat_activity is that it provides just 'current
state', no history. If you don't sample that often enough, you may
completely miss the checkpoint (and thus you'll have no info about it,
unless you enable log_checkpoints and check the log). And it's imposible
to use if you need info about something that happened in the past. And
AFAIK it does not show processes running timed checkpoints for example.

So IMHO it's unusable for what I'm trying to achieve.

Tomas


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-02 20:04:48
Message-ID: 24060.1314993888@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tomas Vondra" <tv(at)fuzzy(dot)cz> writes:
> On 2 Z 2011, 21:23, Tom Lane wrote:
>> Well, to be blunt, putting stuff into the postmaster log is entirely the
>> wrong way to satify a requirement like that. If you want to expose
>> progress information, it should be exposed via something dynamic like
>> pg_stat_activity. What could be useful to log is statistics that people
>> might want to aggregate later, and I don't immediately see a reason why
>> such stats couldn't be logged just once at end of each checkpoint cycle.

> The problem with pg_stat_activity is that it provides just 'current
> state', no history. If you don't sample that often enough, you may
> completely miss the checkpoint (and thus you'll have no info about it,
> unless you enable log_checkpoints and check the log). And it's imposible
> to use if you need info about something that happened in the past. And
> AFAIK it does not show processes running timed checkpoints for example.

Your requirements seem sufficiently slippery that I don't think you've
thought them through very well. As far as I can see, the proposed patch
will mostly result in bloating the postmaster log with repetitive
information of next to no value. I can see the reason for wanting to
know what the system is doing right now, and I can see the reason for
wanting aggregatable statistics so that you can tell over time whether
your settings need to be adjusted. I don't see the value in a lot of
"10% done" log entries --- there is essentially no historical value in
such, IMO, because they don't correspond to any user-level activity.
(Which is what distinguishes this from, say, log_connections or
log_statements.) The fact that you can't figure out a reasonable
frequency for making the entries is a symptom of the design being wrong
at its core.

regards, tom lane


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-03 06:19:14
Message-ID: 4E61C6E2.7050503@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/02/2011 11:10 AM, Tomas Vondra wrote:
> My 'ideal' solution would be either to add another GUC (to turn this
> on/off) or allow log_checkpoints to have three values
>
> log_checkpoints = {off, normal, detailed}
>
> where 'normal' provides the current output and 'detail' produces this much
> verbose output.
>

If this is going to be acceptable, that's likely the only path it could
happen by and still meet what you're looking for. I will just again
stress that the part you're working on instrumenting better right now is
not actually where larger systems really run into the most problems
here, based on what I've seen. I added a series of log messages to 9.1
at DEBUG1, aimed at tracking the sync phase. That's where I see many
more checkpoint issues than in the write one. On Linux in particular,
it's almost impossible for the write phase to be more of a problem than
the sync one.

So the logging you're adding here I don't ever expect to turn on. But I
wouldn't argue against an option to handle the logging use-case you're
concerned about. Letting people observe for themselves and decide which
of the phases is more interesting to their workload seems appropriate.
Then users have options for what to log, no matter which type of problem
they run into.

If you're expanding log_checkpoints to an enum, for that to handle what
I think everybody might ever want (for what checkpoints do now at
least), I'd find that more useful if it happened like this instead:

log_checkpoints = {off, on, write, sync, verbose}

I don't think you should change the semantics of off/on, which will
avoid breaking existing postgresql.conf files and resources that suggest
tuning advice. "write" can toggle on what you're adding; "sync" should
control whether the DEBUG1 messages showing the individual file names in
the sync phase appear; and "verbose" can include both.

As far as a heuristic for making this less chatty when there's nothing
exciting happening goes, I think something based on how much time has
passed would be the best one. In your use case, I would guess you don't
really care whether a message appears every n%. If I understand you
correctly now, you would mainly care about getting enough log detail to
know 1) when things are running really slow, or b) often enough that the
margin of error in your benchmark results from unaccounted checkpoint
writes is acceptable. In both of those cases, I'd think a time-based
threshold would be appropriate, and that also deals with the time-based
checkpoints, too.

If your logging criteria for the write phase was "display a message any
time more than 30 seconds have passed since last seeing one", that would
give you only a few lines of output in a boring, normal
checkpoint--certainly less than the 9 in-progress samples you're
outputting now, at 10% intervals. But in the pathological situations
where writes are super slow, your log data would become correspondingly
denser, which is exactly what you want in that situation.

I think combining the two makes the most sense: "log when >=30 seconds
have passed since the last message, and there's been >=10% more progress
made". (Maybe do the progress check before the time one, to cut down on
gettimeofday() calls) That would give you 4 in-progress reports during
a standard 2.5 minute write phase, and in cases where the checkpoints
are taking a long time you'd get as many as 9. That's pretty close to
auto-tuning the amount of log output, so the amount of it is roughly
proportional to how likely the information it's logging will be
interesting.

We certainly don't want to add yet another GUC just to control the
frequency. I don't think it will be too hard to put two hard-coded
thresholds in and do good enough for just about everyone though. I
would probably prefer setting those thresholds to 60 seconds/20%
instead. That might not be detailed enough for you though.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Magnus Hagander" <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-05 23:07:12
Message-ID: 211db043ef5a961939688099e99ff72b.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2 Září 2011, 22:04, Tom Lane wrote:
> "Tomas Vondra" <tv(at)fuzzy(dot)cz> writes:
>> On 2 Z?????? 2011, 21:23, Tom Lane wrote:
>>> Well, to be blunt, putting stuff into the postmaster log is entirely
>>> the
>>> wrong way to satify a requirement like that. If you want to expose
>>> progress information, it should be exposed via something dynamic like
>>> pg_stat_activity. What could be useful to log is statistics that
>>> people
>>> might want to aggregate later, and I don't immediately see a reason why
>>> such stats couldn't be logged just once at end of each checkpoint
>>> cycle.
>
>> The problem with pg_stat_activity is that it provides just 'current
>> state', no history. If you don't sample that often enough, you may
>> completely miss the checkpoint (and thus you'll have no info about it,
>> unless you enable log_checkpoints and check the log). And it's imposible
>> to use if you need info about something that happened in the past. And
>> AFAIK it does not show processes running timed checkpoints for example.
>
> Your requirements seem sufficiently slippery that I don't think you've
> thought them through very well. As far as I can see, the proposed patch
> will mostly result in bloating the postmaster log with repetitive
> information of next to no value. I can see the reason for wanting to
> know what the system is doing right now, and I can see the reason for
> wanting aggregatable statistics so that you can tell over time whether
> your settings need to be adjusted. I don't see the value in a lot of
> "10% done" log entries --- there is essentially no historical value in
> such, IMO, because they don't correspond to any user-level activity.
> (Which is what distinguishes this from, say, log_connections or
> log_statements.) The fact that you can't figure out a reasonable
> frequency for making the entries is a symptom of the design being wrong
> at its core.

No, I probably haven't thought through all the requirements - but that's
exactly the feedback I expect from a commit fest.

Originally I've built the patch to fix a single use-case, where the 'log
each 10%' approach made perfect sense. I'm the first one to admin this is
not a good approach for the other use-cases and that the frequency should
be figured out in a different way.

Anyway I don't think that a piece of information that does not correspond
to any user-level activity is useless. Checkpoints are often a very
intrusively and significantly influence the users - that's why I believe
any help to tune them is desirable. The ability to see what the system is
doing right now and aggregatable statistics are fine, but in many cases
they're useless (e.g. when investigating past events).

Tomas


From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Greg Smith" <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-05 23:52:52
Message-ID: e6328725ee5af0897c640eb3529a1a18.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3 Září 2011, 8:19, Greg Smith wrote:
> If you're expanding log_checkpoints to an enum, for that to handle what
> I think everybody might ever want (for what checkpoints do now at
> least), I'd find that more useful if it happened like this instead:
>
> log_checkpoints = {off, on, write, sync, verbose}
>
> I don't think you should change the semantics of off/on, which will
> avoid breaking existing postgresql.conf files and resources that suggest
> tuning advice. "write" can toggle on what you're adding; "sync" should
> control whether the DEBUG1 messages showing the individual file names in
> the sync phase appear; and "verbose" can include both.

Thanks, those are definitely good ideas extending the original patch and
making it much more useful I guess.

> As far as a heuristic for making this less chatty when there's nothing
> exciting happening goes, I think something based on how much time has
> passed would be the best one. In your use case, I would guess you don't
> really care whether a message appears every n%. If I understand you
> correctly now, you would mainly care about getting enough log detail to
> know 1) when things are running really slow, or b) often enough that the
> margin of error in your benchmark results from unaccounted checkpoint
> writes is acceptable. In both of those cases, I'd think a time-based
> threshold would be appropriate, and that also deals with the time-based
> checkpoints, too.

Yes, the time-based threshold seems like the right solution.

> If your logging criteria for the write phase was "display a message any
> time more than 30 seconds have passed since last seeing one", that would
> give you only a few lines of output in a boring, normal
> checkpoint--certainly less than the 9 in-progress samples you're
> outputting now, at 10% intervals. But in the pathological situations
> where writes are super slow, your log data would become correspondingly
> denser, which is exactly what you want in that situation.

I still am not sure what should be a reasonable value or how to determine
it. What happens when the checkpoint_timeout is increased, there's more
shared_buffers etc.? What about using (checkpoint_timeout/10) for the
time-based checkpoints and 30s for the other checkpoints?

> I think combining the two makes the most sense: "log when >=30 seconds
> have passed since the last message, and there's been >=10% more progress
> made". (Maybe do the progress check before the time one, to cut down on

Is this is a good idea? The case when the timeout expires and not much
data was written is interesting, and this would not log it. But OTOH this
would nicely solve the issue with time-based checkpoints and a fixed
threshold.

Tomas


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-06 02:58:53
Message-ID: CA+TgmoZ_Peg9xmtQiOLxf4wYApsMY5tjo-2GmuwKika2Shs2hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Sep 5, 2011 at 7:52 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> If your logging criteria for the write phase was "display a message any
>> time more than 30 seconds have passed since last seeing one", that would
>> give you only a few lines of output in a boring, normal
>> checkpoint--certainly less than the 9 in-progress samples you're
>> outputting now, at 10% intervals.  But in the pathological situations
>> where writes are super slow, your log data would become correspondingly
>> denser, which is exactly what you want in that situation.
>
> I still am not sure what should be a reasonable value or how to determine
> it. What happens when the checkpoint_timeout is increased, there's more
> shared_buffers etc.? What about using (checkpoint_timeout/10) for the
> time-based checkpoints and 30s for the other checkpoints?

I think the idea here is that we only need to log a message often
enough that the admin who is sitting there watching this won't get too
impatient waiting for the next one. As that's not a function of
checkpoint_timeout, I don't see much value in conditioning this on
that. +1 for the suggestion of 30s intervals - that seems infrequent
enough not to result in too much log spam, but sufficiently frequent
that anyone who is concerned about checkpoint progress won't have to
wait terribly long to find out how things are going.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-19 14:55:23
Message-ID: 4E7757DB.3000601@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/05/2011 07:52 PM, Tomas Vondra wrote:
>> If your logging criteria for the write phase was "display a message any
>> time more than 30 seconds have passed since last seeing one", that would
>> give you only a few lines of output in a boring, normal
>> checkpoint--certainly less than the 9 in-progress samples you're
>> outputting now, at 10% intervals. But in the pathological situations
>> where writes are super slow, your log data would become correspondingly
>> denser, which is exactly what you want in that situation.
>>
> I still am not sure what should be a reasonable value or how to determine
> it. What happens when the checkpoint_timeout is increased, there's more
> shared_buffers etc.? What about using (checkpoint_timeout/10) for the
> time-based checkpoints and 30s for the other checkpoints?
>

That may work fine. Maybe implement it like that, and see if the amount
of logging detail is reasonable in a couple of test scenarios.

>> I think combining the two makes the most sense: "log when>=30 seconds
>> have passed since the last message, and there's been>=10% more progress
>> made". (Maybe do the progress check before the time one, to cut down on
>>
> Is this is a good idea? The case when the timeout expires and not much
> data was written is interesting, and this would not log it. But OTOH this
> would nicely solve the issue with time-based checkpoints and a fixed
> threshold.
>

One thing I am trying to avoid here is needing to check the system clock
after every buffer write. I also consider it useful to put an upper
bound on how many of these messages will appear even in the verbose
mode. This deals with both those problems.

Yes, there is a potential problem with this idea. Let's say checkpoint
writes degrade to where they take an hour. In that case, you won't see
the first progress report until 6 minutes (10%) have gone by with this
implementation. I don't see a good way to resolve that without
violating one of the other priorities I listed above though. You'll
have to poll the system clock constantly and will end up creating a lot
of log entries if you don't do a check against the % progress first.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-19 15:19:28
Message-ID: 17352.1316445568@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <greg(at)2ndQuadrant(dot)com> writes:
> One thing I am trying to avoid here is needing to check the system clock
> after every buffer write.

On machines where gettimeofday is slow (and last I heard there were
still lots of them), any such thing would be a disaster
performance-wise. I'm still afraid to add more gettimeofday's into the
query parse/plan/execute code path, even though it would greatly ease
the problem of figuring out whether re-planning is worthwhile.

regards, tom lane


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-19 18:42:53
Message-ID: m2ty882x1u.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> On machines where gettimeofday is slow (and last I heard there were
> still lots of them), any such thing would be a disaster
> performance-wise. I'm still afraid to add more gettimeofday's into the
> query parse/plan/execute code path, even though it would greatly ease
> the problem of figuring out whether re-planning is worthwhile.

Excuse my ignorance here, but is SIGALARM less of a problem? Then you
could ask the system for an alarm next second and count the alarms
rather than poll the clock. We don't need high precision in both those
cases I guess.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support