Re: Clearing global statistics

Lists: pgsql-hackers
From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Clearing global statistics
Date: 2009-12-05 03:18:19
Message-ID: 4B19D0FB.7000100@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Since the pg_stat_bgwriter structure was introduced in 8.3, there's
never been any way to reset its statistics back to 0. A week of
analyzing data from that every day drove me crazy enough to finally fix
this with the attached patch. This implements the TODO item "Allow the
clearing of cluster-level statistics", based on previous discussion
ending at http://archives.postgresql.org/pgsql-hackers/2009-03/msg00920.php

Here's the patch in action:

gsmith=# select checkpoints_req,buffers_alloc from pg_stat_bgwriter;
checkpoints_req | buffers_alloc
-----------------+---------------
1 | 5

gsmith=# select pg_stat_reset_global();

gsmith=# select checkpoints_req,buffers_alloc from pg_stat_bgwriter;
checkpoints_req | buffers_alloc
-----------------+---------------
0 | 0

Patch is complete including docs, it's basically just pg_stat_reset with
a different clearing mechanism at the very end. My list of potential
questions here are:

-Not sure if this should be named pg_stat_rest_global (to match the way
these are called "global stats" in the source) or
pg_stat_reset_cluster. Picked the former for V1, not attached to that
decision at all. Might even make sense to use a name that makes it
obvious the pg_stat_bgwriter data is what's targeted.

-I create a new stats message type for this, but just reuse the same
message payload structure as pg_stats_reset rather than add a new
payload structure for no good reason. That's marked with two XXX s in
the code as a questionable design decision. I can implement that too if
there's some reason it's a good idea I don't know yet.

-I grabbed what looked like an appropriate unused OID. I'm never sure
if I did that right or not though, it may need to be renumbered.

Since this whole patch is basically a cut and paste job of code that was
already there, I don't really expect it to need much discussion beyond
these minor points; wouldn't have sent it in the middle of an active
CommitFest if that weren't the case.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com

Attachment Content-Type Size
stat-reset-global-v1.patch text/x-patch 5.8 KB

From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2009-12-07 00:44:57
Message-ID: 20091207094457.950B.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Greg Smith <greg(at)2ndquadrant(dot)com> wrote:

> This implements the TODO item "Allow the
> clearing of cluster-level statistics", based on previous discussion
> ending at http://archives.postgresql.org/pgsql-hackers/2009-03/msg00920.php

> -Not sure if this should be named pg_stat_rest_global (to match the way
> these are called "global stats" in the source) or
> pg_stat_reset_cluster. Picked the former for V1, not attached to that
> decision at all. Might even make sense to use a name that makes it
> obvious the pg_stat_bgwriter data is what's targeted.

A couple of comments:

* We will be able to reset global counters and current database's counters.
Do we need to have a method to reset other databases' counters?
Or, will pg_stat_reset_global just reset counters of all databases?

* Is it useful to have a method to reset counters separately?
For example, pg_stat_reset( which text )
which := 'buffers' | 'checkpoints' | 'tables' | 'functions' | ...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2009-12-07 00:50:11
Message-ID: 4B1C5143.8090303@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Itagaki Takahiro wrote:
> Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>
>
>> -Not sure if this should be named pg_stat_rest_global (to match the way
>> these are called "global stats" in the source) or
>> pg_stat_reset_cluster. Picked the former for V1, not attached to that
>> decision at all. Might even make sense to use a name that makes it
>> obvious the pg_stat_bgwriter data is what's targeted.
>>
>
> A couple of comments:
>
> * We will be able to reset global counters and current database's counters.
> Do we need to have a method to reset other databases' counters?
> Or, will pg_stat_reset_global just reset counters of all databases?
>
> * Is it useful to have a method to reset counters separately?
> For example, pg_stat_reset( which text )
> which := 'buffers' | 'checkpoints' | 'tables' | 'functions' | ...
>
The fact that you're asking the question this way suggests to me I've
named this completely wrong. pg_stat_reset_global only resets the bits
global to all databases. It doesn't touch any of the database-specific
things that pg_stat_reset can handle right now. At the moment, the only
global information is what's in pg_stat_bgwriter: buffer statistics and
checkpoint stats. I'm thinking that I should rename this new function
to pg_stat_reset_bgwriter so it's obvious how limited its target is.
Using either "global" or "cluster" for the name is just going to leave
people thinking it acts across a much larger area than it does.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2009-12-07 01:01:04
Message-ID: 20091207100104.950F.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Greg Smith <greg(at)2ndquadrant(dot)com> wrote:

> I'm thinking that I should rename this new function
> to pg_stat_reset_bgwriter so it's obvious how limited its target is.

I don't think it is a good name because we might have another cluster-level
statictics not related with bgwriter in the future. I hope you will suggest
"extensible" method when you improve this area.

To be honest, I have a plan to add performance statistics counters to
postgres. It is not bgwriter's counters, but cluster-level. I'd like
to use your infrastructure in my work, too :)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2009-12-07 01:12:20
Message-ID: 4B1C5674.5060901@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Itagaki Takahiro wrote:
> Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>
>
>> I'm thinking that I should rename this new function
>> to pg_stat_reset_bgwriter so it's obvious how limited its target is.
>>
>
> I don't think it is a good name because we might have another cluster-level
> statictics not related with bgwriter in the future. I hope you will suggest
> "extensible" method when you improve this area.
>
I follow what you mean now. I'll take a look at allowing pg_stat_reset
to act on an input as you suggested, rather than adding more of these
UDFs for every time somebody adds a new area they want to target clearing.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2009-12-07 10:21:16
Message-ID: 9837222c0912070221w76ebea52i2b6080038109528f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/7 Greg Smith <greg(at)2ndquadrant(dot)com>:
> Itagaki Takahiro wrote:
>
> Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>
>
>
> I'm thinking that I should rename this new function
> to pg_stat_reset_bgwriter so it's obvious how limited its target is.
>
>
> I don't think it is a good name because we might have another cluster-level
> statictics not related with bgwriter in the future. I hope you will suggest
> "extensible" method when you improve this area.
>
>
> I follow what you mean now.  I'll take a look at allowing pg_stat_reset to act on an input as you suggested, rather than adding more of these UDFs for every time somebody adds a new area they want to target clearing.

I have on my TODO to implement the ability to do stats reset on a
single object (say, one table only). Please take this into
consideration when you design/name this, so theres no unnecessary
overlap :-) Same goes for the stats message itself.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2009-12-09 17:55:05
Message-ID: 1260381305.12582.0.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On sön, 2009-12-06 at 19:50 -0500, Greg Smith wrote:
> The fact that you're asking the question this way suggests to me I've
> named this completely wrong. pg_stat_reset_global only resets the
> bits
> global to all databases. It doesn't touch any of the
> database-specific
> things that pg_stat_reset can handle right now. At the moment, the
> only
> global information is what's in pg_stat_bgwriter: buffer statistics
> and
> checkpoint stats. I'm thinking that I should rename this new
> function
> to pg_stat_reset_bgwriter so it's obvious how limited its target is.
> Using either "global" or "cluster" for the name is just going to
> leave
> people thinking it acts across a much larger area than it does.

The term "shared" is used elsewhere to describe the, well, shared
catalogs.


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-12 18:46:13
Message-ID: 4B4CC375.1070409@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
> I have on my TODO to implement the ability to do stats reset on a
> single object (say, one table only). Please take this into
> consideration when you design/name this, so theres no unnecessary
> overlap :-) Same goes for the stats message itself.
>

The idea suggested upthread was to add this:

pg_stat_reset( which text )
which := 'buffers' | 'checkpoints' | 'tables' | 'functions' | ...

Now, the way the pg_stat_bgwriter tables are computed, it doesn't
actually make sense to separate out clearing the buffers/checkpoints
stats, since one of those values is in both categories:
buffers_checkpoint. They're really all too tightly coupled to break
them apart. So I was thinking of this:

pg_stat_reset( which text )
which := 'bgwriter' | ...

I could convert the patch I've got to be an initial implementation of
this new "pg_stat_reset with a parameter", laying some useful groundwork
in the process too. Then people who want to reset more things can just
re-use that same outline and message passing mechanism, just adding
comparisons for new text and a handler to go with it--not even touching
the catalog again.

This may not mesh well with what you plan though. If pg_stat_reset is
updated to reset stats on an individual table, that could be a second
version that takes in a regclass:

pg_stat_reset('tablename'::regclass)

But that seems like a confusing bit of overloading--I can easily see
people thinking that pg_stat_reset('bgwriter') would be resetting the
stats for a relation named 'bgwriter' rather than what it actually does
if I build it that way.

So, combining with Peter's naming suggestion, I think what I should
build is:

pg_stat_reset_shared( which text )
which := 'bgwriter' | ...

Which satisfies what I'm looking for now, and future patches that need
to reset other shared across the cluster statistics can re-use this
without needing to add a whole new function/stats message. I think that
satisfies the cross-section of planned use cases we're expecting now best.

Any comments before I update my patch to do that?

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-13 01:29:05
Message-ID: 4B4D21E1.3010905@timbira.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith escreveu:
> pg_stat_reset( which text )
> which := 'buffers' | 'checkpoints' | 'tables' | 'functions' | ...
>
What about adding 'all' too? Or the idea is resetting all global counters when
we call pg_stat_reset() (without parameters)?

--
Euler Taveira de Oliveira
http://www.timbira.com/


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 10:12:18
Message-ID: 9837222c1001140212i2eb025f8w79656dcd3cb840e8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/12 Greg Smith <greg(at)2ndquadrant(dot)com>:
> Magnus Hagander wrote:
>>
>> I have on my TODO to implement the ability to do stats reset on a
>> single object (say, one table only). Please take this into
>> consideration when you design/name this, so theres no unnecessary
>> overlap :-) Same goes for the stats message itself.
>>
>
> The idea suggested upthread was to add this:
>
> pg_stat_reset( which text )
>      which := 'buffers' | 'checkpoints' | 'tables' | 'functions' |  ...
>
> Now, the way the pg_stat_bgwriter tables are computed, it doesn't actually make sense to separate out clearing the buffers/checkpoints stats, since one of those values is in both categories:  buffers_checkpoint.  They're really all too tightly coupled to break them apart.  So I was thinking of this:
>
> pg_stat_reset( which text )
>      which := 'bgwriter' | ...
>
> I could convert the patch I've got to be an initial implementation of this new "pg_stat_reset with a parameter", laying some useful groundwork in the process too.  Then people who want to reset more things can just re-use that same outline and message passing mechanism, just adding comparisons for new text and a handler to go with it--not even touching the catalog again.
>
> This may not mesh well with what you plan though.  If pg_stat_reset is updated to reset stats on an individual table, that could be a second version that takes in a regclass:
>
> pg_stat_reset('tablename'::regclass)
>
> But that seems like a confusing bit of overloading--I can easily see people thinking that pg_stat_reset('bgwriter') would be resetting the stats for a relation named 'bgwriter' rather than what it actually does if I build it that way.
>
> So, combining with Peter's naming suggestion, I think what I should build is:
>
> pg_stat_reset_shared( which text )
>      which := 'bgwriter' | ...
>
> Which satisfies what I'm looking for now, and future patches that need to reset other shared across the cluster statistics can re-use this without needing to add a whole new function/stats message.  I think that satisfies the cross-section of planned use cases we're expecting now best.
>
> Any comments before I update my patch to do that?

Are you planning to get this in for the CF? (Yes, I realize there are
only hours left). This is functionality I'd *really* like to see in
8.5, so I'll be happy to work with you to get that committed inside or
outside CF bounds, but it's easier if it's on there for reviewers ;)
(plus, the "outside cf bounds" really only works *before* the
commitfest :P)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 13:42:59
Message-ID: 4B4F1F63.8050606@usit.uio.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greg Smith wrote:
> Magnus Hagander wrote:
>> I have on my TODO to implement the ability to do stats reset on a
>> single object (say, one table only). Please take this into
>> consideration when you design/name this, so theres no unnecessary
>> overlap :-) Same goes for the stats message itself.
>>
[.........]
>
> Any comments before I update my patch to do that?
>

Hello

One thing I miss from the statistics you can get via pg_stat_* is
information about how long we have been collecting stats (or in other
words, when was the last time the stats were reset)

Statistics without time period information are unfortunately not very
usefull for a DBA :-(

Before 8.3, we had the stats_reset_on_server_start parameter and the
pg_postmaster_start_time() function. This was an easy way of resetting
*all* statistics delivered by pg_stat_* and knowing when this was done.
We were able to produce stats with information about sec/hours/days
average values in an easy way.

I tried to discuss this some time ago but we did not get anywhere,
Ref: http://archives.postgresql.org/pgsql-general/2009-07/msg00614.php

Maybe this time? :-)

Is there any chance of implementing a way of knowing when was the last
time statistics delivered via pg_stat_* were reset?

regards,
- --
Rafael Martinez, <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)

iD8DBQFLTx9cBhuKQurGihQRAnTZAJ9afYGu4UShAha0L6Z3OFyqgJ6SJQCffEow
sfFKKoT3ODap6JRpn2I1IfI=
=bCqY
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 16:01:00
Message-ID: 24296.1263484860@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no> writes:
> Is there any chance of implementing a way of knowing when was the last
> time statistics delivered via pg_stat_* were reset?

Actually, that brings up a more general question: what's with the
enthusiasm for clearing statistics *at all*? ISTM that's something
you should do only in dire emergencies, like the collector went
haywire and has now got a bunch of garbage numbers. The notion of
resetting subsets of the stats seems even more dubious, because now
you have numbers that aren't mutually comparable. So I fail to
understand why the desire to expend valuable development time on
any of this.

regards, tom lane


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 17:42:41
Message-ID: 4B4F5791.5080303@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Actually, that brings up a more general question: what's with the
> enthusiasm for clearing statistics *at all*? ISTM that's something
> you should do only in dire emergencies, like the collector went
> haywire and has now got a bunch of garbage numbers. The notion of
> resetting subsets of the stats seems even more dubious, because now
> you have numbers that aren't mutually comparable. So I fail to
> understand why the desire to expend valuable development time on
> any of this.
>

When doing checkpoint tuning, the usual thing you start with is by
considering the ratio of time to segment-based checkpoints, along with
the corresponding balance of buffers written by the backends vs. the
checkpoint. When that shows poor behavior, typically because
checkpoint_segments is too low, you change its value and then resume
monitoring at the new setting. Right now, you're still carrying around
the history of the bad period forever though, and every check of the
pg_stat_bgwriter requires manually subtracting the earlier values out.
What people would like to do is reset those after adjusting
checkpoint_segments, and then you can eyeball the proportions directly
instead. That's exactly what the patch does. If I didn't see this
request in the field every month I wouldn't have spent a minute on a
patch to add it.

There was a suggestion that subsets of the data I'm clearing might be
useful to target, which I rejected on the bounds that it made it
possible to get an inconsistent set of results as you're concerned
about. You really need to clear everything that shows up in
pg_stat_bgwriter or not touch it at all. The main use case I'm trying
to support is the person who just made a config change and now wants to do:

select pg_stat_reset();
select pg_stat_reset_shared('bgwriter');

So that all of the stats they're now dealing with are from the same
post-tuning time period. Having numbers that are "mutually comparable"
across the whole system is exactly the reason why this new call is
needed, because there's this one part you just can't touch.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 17:44:37
Message-ID: 4B4F5805.5040403@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Euler Taveira de Oliveira wrote:
> Greg Smith escreveu:
>
>> pg_stat_reset( which text )
>> which := 'buffers' | 'checkpoints' | 'tables' | 'functions' | ...
>>
>>
> What about adding 'all' too? Or the idea is resetting all global counters when
> we call pg_stat_reset() (without parameters)?
>

Once there's more than one piece to clear maybe adding in an 'all'
target makes sense. In the context of the update patch I've finished,
it just doesn't make sense given the code involved.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 17:46:50
Message-ID: 9837222c1001140946r55c2d665ve9b7d26fe800b2dd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/14 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no> writes:
>> Is there any chance of implementing a way of knowing when was the last
>> time statistics delivered via pg_stat_* were reset?
>
> Actually, that brings up a more general question: what's with the
> enthusiasm for clearing statistics *at all*?  ISTM that's something
> you should do only in dire emergencies, like the collector went
> haywire and has now got a bunch of garbage numbers.  The notion of
> resetting subsets of the stats seems even more dubious, because now
> you have numbers that aren't mutually comparable.  So I fail to
> understand why the desire to expend valuable development time on
> any of this.

s/collector/application/ and you've got one reason.

Example, that I hit the other day. Examining pg_stat_user_functions
shows one function taking much longer than you'd expect. Called about
6 million times, total time about 7 days spent. Reason turned out to
be a missing index. Without clearing the stats, it'll take a *long*
time before the average goes down enough to make it possible to use
the simple SELECT self_time/calls FROM pg_stat_user_functions WHERE...
to monitor. Sure, if you have a system that graphs it, it'll update
properly, but for the quick manual checks, that view suddenly becomes
a lot less ueful.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 17:54:20
Message-ID: 4B4F5A4C.5000802@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Rafael Martinez wrote:
> One thing I miss from the statistics you can get via pg_stat_* is
> information about how long we have been collecting stats (or in other
> words, when was the last time the stats were reset)
>

I've considered adding this for the same reasons you're asking about it,
but am not happy with the trade-offs involved. The problem is that you
have to presume the server was running the entirety of the time since
stats were reset for that data to be useful. So unless people are in
that situation, they're going to get data that may not represent what
they think it does. Realistically, if you want a timestamp that always
means something useful you have to rest the stats at every server start,
which leads us to:

> Before 8.3, we had the stats_reset_on_server_start parameter and the
> pg_postmaster_start_time() function. This was an easy way of resetting
> *all* statistics delivered by pg_stat_* and knowing when this was done.
> We were able to produce stats with information about sec/hours/days
> average values in an easy way.
>

With this new feature I'm submitting, you can adjust your database
startup scripts to make this happen again. Start the server,
immediately loop over every database and call pg_stat_reset on them all,
and call pg_stat_reset_shared('bgwriter'). Now you've got completely
cleared stats that are within a second or two of
pg_postmaster_start_time(), should be close enough to most purposes.
Theoretically we could automate that better, but I've found it hard to
justify working on given that it's not that difficult to handle outside
of the database once the individual pieces are exposed.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 18:11:23
Message-ID: 26349.1263492683@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> Actually, that brings up a more general question: what's with the
>> enthusiasm for clearing statistics *at all*?

> ... Right now, you're still carrying around
> the history of the bad period forever though, and every check of the
> pg_stat_bgwriter requires manually subtracting the earlier values out.

Seems like a more appropriate solution would be to make it easier to do
that subtraction, ie, make it easier to capture the values at a given
time point and then get deltas from there. It's more general (you could
have multiple saved sets of values), and doesn't require superuser
permissions to do, and doesn't have the same potential for
damn-I-wish-I-hadn't-done-that moments.

regards, tom lane


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-14 18:46:47
Message-ID: 4B4F6697.8040808@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Seems like a more appropriate solution would be to make it easier to do
> that subtraction, ie, make it easier to capture the values at a given
> time point and then get deltas from there. It's more general (you could
> have multiple saved sets of values), and doesn't require superuser
> permissions to do, and doesn't have the same potential for
> damn-I-wish-I-hadn't-done-that moments.
>

You can make the same argument about the existing pg_stat_reset
mechanism. I would love to completely rework the stats infrastructure
so that it's easier to capture values with timestamps, compute diffs,
and do trending. However, I'm not sure the database itself is
necessarily the best place to do that at anyway. People who know what
they're doing are already handling this exact job using external tools
that grab regular snapshots for that purpose, so why try to duplicate
that work?

I'm trying to triage here, to scrub off the worst of the common
problems. I would never claim this is the perfect direction to follow
forever. There are a number of people who consider the inability to
reset the pg_stat_bgwriter stats in any way a bug that's gone unfixed
for two versions now. Your larger point that this style of
implementation is not ideal as a long-term way to manage statistics I
would completely agree with, I just don't have the time to spend on a
major rewrite to improve that. What I can offer is a fix for the most
common issue I get complaints about, in the form of a tool much more
likely to be used correctly by people who go looking for it than misused
IMHO.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2010-01-14 21:04:23
Message-ID: 4B4F86D7.8070802@usit.uio.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
>
>> Before 8.3, we had the stats_reset_on_server_start parameter and
>> the pg_postmaster_start_time() function. This was an easy way of
>> resetting *all* statistics delivered by pg_stat_* and knowing when
>> this was done. We were able to produce stats with information about
>> sec/hours/days average values in an easy way.
>>
>
> With this new feature I'm submitting, you can adjust your database
> startup scripts to make this happen again. Start the server,
> immediately loop over every database and call pg_stat_reset on them
> all, and call pg_stat_reset_shared('bgwriter'). Now you've got
> completely cleared stats that are within a second or two of
> pg_postmaster_start_time(), should be close enough to most purposes.
> Theoretically we could automate that better, but I've found it hard
> to justify working on given that it's not that difficult to handle
> outside of the database once the individual pieces are exposed.
>

Great, this is good enough and we get what we need. Thanks :-)

regards
--
Rafael Martinez, <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Clearing global statistics
Date: 2010-01-14 21:20:22
Message-ID: 4B4F8A96.5080004@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Itagaki Takahiro wrote:
> To be honest, I have a plan to add performance statistics counters to
> postgres. It is not bgwriter's counters, but cluster-level. I'd like
> to use your infrastructure in my work, too :)
>

Attached patch provides just that. It still works basically the same as
my earlier version, except you pass it a name of what you want to reset,
and if you don't give it the only valid one right now ('bgwriter') it
rejects it (for now):

gsmith=# select checkpoints_req,buffers_alloc from pg_stat_bgwriter;
checkpoints_req | buffers_alloc
-----------------+---------------
4 | 129
(1 row)

gsmith=# select pg_stat_reset_shared('bgwriter');
pg_stat_reset_shared
----------------------

(1 row)

gsmith=# select checkpoints_req,buffers_alloc from pg_stat_bgwriter;
checkpoints_req | buffers_alloc
-----------------+---------------
0 | 7
(1 row)

gsmith=# select pg_stat_reset_shared('rubbish');
ERROR: Unrecognized reset target

I turn the input text into an enum choice as part of composing the
message to the stats collector. If you wanted to add some other shared
cluster-wide reset capabilities into there you could re-use most of this
infrastructure. Just add an extra enum value, map the text into that
enum, and write the actual handler that does the reset work. Should be
able to reuse the same new message type and external UI I implemented
for this specific clearing feature.

I didn't see any interaction to be concerned about here with Magnus's
suggestion he wanted to target stats reset on objects such as a single
table at some point.

The main coding choice I wasn't really sure about is how I flag the
error case where you pass bad in. I do that validation and throw
ERRCODE_SYNTAX_ERROR before composing the message to the stats
collector. Didn't know if I should create a whole new error code just
for this specific case or if reusing another error code was more
appropriate. Also, I didn't actually have the collector process itself
validate the data at all, it just quietly ignores bad messages on the
presumption everything is already being checked during message creation.
That seems consistent with the other code here--the other message
handlers only seem to throw errors when something really terrible
happens, not when they just don't find something useful to do.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com

Attachment Content-Type Size
stat-reset-global-v3.patch text/x-patch 7.6 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-17 00:33:13
Message-ID: 603c8f071001161633m27d80fbdje1c5cf28ae72c855@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 14, 2010 at 1:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Smith <greg(at)2ndquadrant(dot)com> writes:
>> Tom Lane wrote:
>>> Actually, that brings up a more general question: what's with the
>>> enthusiasm for clearing statistics *at all*?
>
>> ... Right now, you're still carrying around
>> the history of the bad period forever though, and every check of the
>> pg_stat_bgwriter requires manually subtracting the earlier values out.
>
> Seems like a more appropriate solution would be to make it easier to do
> that subtraction, ie, make it easier to capture the values at a given
> time point and then get deltas from there.  It's more general (you could
> have multiple saved sets of values), and doesn't require superuser
> permissions to do, and doesn't have the same potential for
> damn-I-wish-I-hadn't-done-that moments.

True, but it's also more complicated to use. Most systems I'm
familiar with[1] that have performance counters just provide an option
to clear them. Despite the disadvantages you cite, it seems to be
fairly useful in practice; anyway, I have found it so.

...Robert

[1] The other design I've seen is a system that automatically resets,
say, once a day. It retains the statistics for the 24-hour period
between the most recent two resets, and the statistics for the partial
period following the last reset. But that doesn't seem appropriate
for PostgreSQL....


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-17 15:29:39
Message-ID: 9837222c1001170729w335cc428h3db761b75563ce83@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/1/14 Greg Smith <greg(at)2ndquadrant(dot)com>:
> Itagaki Takahiro wrote:
>>
>> To be honest, I have a plan to add performance statistics counters to
>> postgres. It is not bgwriter's counters, but cluster-level. I'd like
>> to use your infrastructure in my work, too :)
>>
>
> Attached patch provides just that. It still works basically the same as my earlier version, except you pass it a name of what you want to reset, and if you don't give it the only valid one right now ('bgwriter') it rejects it (for now):
>
> gsmith=# select checkpoints_req,buffers_alloc from pg_stat_bgwriter;
> checkpoints_req | buffers_alloc
> -----------------+---------------
> 4 | 129
> (1 row)
>
> gsmith=# select pg_stat_reset_shared('bgwriter');
> pg_stat_reset_shared
> ----------------------
>
> (1 row)
>
> gsmith=# select checkpoints_req,buffers_alloc from pg_stat_bgwriter;
> checkpoints_req | buffers_alloc
> -----------------+---------------
> 0 | 7
> (1 row)
>
> gsmith=# select pg_stat_reset_shared('rubbish');
> ERROR: Unrecognized reset target
>
> I turn the input text into an enum choice as part of composing the message to the stats collector. If you wanted to add some other shared cluster-wide reset capabilities into there you could re-use most of this infrastructure. Just add an extra enum value, map the text into that enum, and write the actual handler that does the reset work. Should be able to reuse the same new message type and external UI I implemented for this specific clearing feature.

Yeah, that seems fine.

> I didn't see any interaction to be concerned about here with Magnus's suggestion he wanted to target stats reset on objects such as a single table at some point.

Agreed.

> The main coding choice I wasn't really sure about is how I flag the error case where you pass bad in. I do that validation and throw ERRCODE_SYNTAX_ERROR before composing the message to the stats collector. Didn't know if I should create a whole new error code just for this specific case or if reusing another error code was more appropriate. Also, I didn't actually have the collector process itself validate the data at all, it just quietly ignores bad messages on the presumption everything is already being checked during message creation. That seems consistent with the other code here--the other message handlers only seem to throw errors when something really terrible happens, not when they just don't find something useful to do.

Creating a whole new error code seems completely wrong.
ERRCODE_SYNTAX_ERROR seems fine to me. As does the not checking the
other options.

I looked the patch over, and I only really see one thing to comment on which is:
+ if (strcmp(target, "bgwriter") == 0)
+ msg.m_resettarget = RESET_BGWRITER;
+ else
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("Unrecognized reset target")));
+ }

Maybe this should be "Unrecognized reset target: %s", target, and also
a errhint() saying which targets are allowed. Thoughts?

As for the discussion about if this is useful or not, I definitely
think it is. Yes, there are certainly general improvements that could
be done to the collection mechanism to make some things easier, but
that doesn't mean we shouldn't make the current solution more useful
until we (potentially) have it.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-17 23:52:11
Message-ID: 4B53A2AB.3060700@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
> Maybe this should be "Unrecognized reset target: %s", target, and also
> a errhint() saying which targets are allowed. Thoughts?
>

That seems reasonable. The other thing I realized is that I forgot to
add the new function to the right place in doc/src/sgml/func.sgml :

<indexterm>
<primary>pg_stat_reset_shared</primary>
</indexterm>

I can send an updated patch with both of these things fixed tomorrow.
Given that we're talking 5 lines of change here, if it's easier for you
to just patch a working copy you've already started on that's fine too.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-18 10:02:29
Message-ID: 9837222c1001180202i79311495qfbe0f5ce4c0c78f1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 18, 2010 at 00:52, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> Magnus Hagander wrote:
>>
>> Maybe this should be "Unrecognized reset target: %s", target, and also
>> a errhint() saying which targets are allowed. Thoughts?
>>
>
> That seems reasonable.  The other thing I realized is that I forgot to add
> the new function to the right place in doc/src/sgml/func.sgml :
>
>   <indexterm>
>    <primary>pg_stat_reset_shared</primary>
>   </indexterm>
>
> I can send an updated patch with both of these things fixed tomorrow.  Given
> that we're talking 5 lines of change here, if it's easier for you to just
> patch a working copy you've already started on that's fine too.

I can deal with that, I just wanted to make sure we're in agreement.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clearing global statistics
Date: 2010-01-19 14:12:21
Message-ID: 9837222c1001190612o1dbbac7esd515abd75afee694@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 18, 2010 at 11:02, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Mon, Jan 18, 2010 at 00:52, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>> Magnus Hagander wrote:
>>>
>>> Maybe this should be "Unrecognized reset target: %s", target, and also
>>> a errhint() saying which targets are allowed. Thoughts?
>>>
>>
>> That seems reasonable.  The other thing I realized is that I forgot to add
>> the new function to the right place in doc/src/sgml/func.sgml :
>>
>>   <indexterm>
>>    <primary>pg_stat_reset_shared</primary>
>>   </indexterm>
>>
>> I can send an updated patch with both of these things fixed tomorrow.  Given
>> that we're talking 5 lines of change here, if it's easier for you to just
>> patch a working copy you've already started on that's fine too.
>
> I can deal with that, I just wanted to make sure we're in agreement.

Applied with agreed upon change and some minor styling/formatting changes.

The mention you have about documentation above is incorrect - the
other pg_stat_xyz functions aren't listed there, so neither should
this one. At least from AFAICT ;)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/