stored procedure stats in collector

Lists: pgsql-hackers
From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: stored procedure stats in collector
Date: 2007-07-13 11:11:04
Message-ID: 46975DC8.2000000@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Howdy,

I'm working on a patch to extend the stats collector to handle stored procedure
statistics (call counts, duration etc). The goal is to make this information
visible via pg_stat functions/views. The collection would be controllable via
"stats_function_level" GUC and will have minimal overhead when turned off. At
our company we depend heavily on stored procedures and such a tool would be of
great help. Perhaps others could also find it somewhat useful.

So far I have a "working" patch against 8.2.4 (attached) that places some
instrumentation in the executor (ExecMakeFunctionResult,
ExecMakeFunctionResultNoSets and ExecMakeTableFunctionResult) and send the
stats to the collector (handled similarly to table stats). The results are
visible through pg_stat_user_functions:

martinp=# set stats_function_level = on;
SET
Time: 0.172 ms

martinp=# select count(f3()) from generate_series(1, 1000);
count
-------
1000
(1 row)

Time: 40059.713 ms

martinp=# select * from pg_stat_user_functions ;
procid | schemaname | procname | nargs | calls | total_time | total_cpu | self_time | self_cpu
--------+------------+----------+-------+-------+------------+-----------+-----------+----------
16388 | public | f1 | 0 | 4000 | 14978 | 8352 | 14978 | 8352
16389 | public | f2 | 0 | 2000 | 40044 | 8364 | 25066 | 12
16390 | public | f3 | 0 | 1000 | 40054 | 8364 | 9 | 0
(3 rows)

Time units are in milliseconds. Only functions with oid >= FirstNormalObjectId
are accounted.

This is of course still very experimental - some work is required to get this
into production shape. I was hoping to get some feedback before I continue
though. Maybe there are some obvious flaws that I'm not seeing, maybe something
needs to be changed to make this more generally useful. Feedback is greatly
appreciated.

Regards,
Martin

Attachment Content-Type Size
8.2.4-procstat.patch.gz application/x-gzip 10.9 KB

From: Neil Conway <neilc(at)samurai(dot)com>
To: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-07-13 18:13:16
Message-ID: 1184350396.13129.10.camel@dell.linuxdev.us.dell.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2007-07-13 at 14:11 +0300, Martin Pihlak wrote:
> I'm working on a patch to extend the stats collector to handle stored procedure
> statistics (call counts, duration etc). The goal is to make this information
> visible via pg_stat functions/views. The collection would be controllable via
> "stats_function_level" GUC and will have minimal overhead when turned off. At
> our company we depend heavily on stored procedures and such a tool would be of
> great help. Perhaps others could also find it somewhat useful.

Very cool, certainly sounds like a useful feature to me.

> martinp=# select * from pg_stat_user_functions ;
> procid | schemaname | procname | nargs | calls | total_time | total_cpu | self_time | self_cpu
> --------+------------+----------+-------+-------+------------+-----------+-----------+----------
> 16388 | public | f1 | 0 | 4000 | 14978 | 8352 | 14978 | 8352
> 16389 | public | f2 | 0 | 2000 | 40044 | 8364 | 25066 | 12
> 16390 | public | f3 | 0 | 1000 | 40054 | 8364 | 9 | 0
> (3 rows)

(schemaname, procname, nargs) is still ambiguous in the face of function
overloading. Although the presence of procid uniquely identifies each
function anyway, if you're going to include the name and argument
information, it might be worth including the argument types as well (as
an array of regtype, perhaps).

> Only functions with oid >= FirstNormalObjectId are accounted.

Sounds reasonable to me; adding profiling to every DirectFunctionCall
invocation is likely to be too expensive anyway.

>From looking quickly at the patch, I don't think the current coding
handles set-returning functions (ExecMakeTableFunctionResult).

-Neil


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-07-16 01:53:59
Message-ID: 19469.1184550839@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com> writes:
> I'm working on a patch to extend the stats collector to handle stored procedure
> statistics (call counts, duration etc).
> ...
> Only functions with oid >= FirstNormalObjectId are accounted.

I really dislike that approach to deciding which functions to count.
The main problem with it is that it will try to count C-language
functions that are added after initdb, such as contrib stuff and
third-party add-ons like postgis. The percentage overhead for a
typical short C function will be large, and I'm not sure anything
much is to be gained by counting these.

I think a more reasonable approach would be to count PL-language
functions; which in turn suggests that the instrumentation hooks
should be in the PL call handlers, not in ExecMakeFunctionResult
and friends where they will drag down performance of all functions.

BTW, I dunno if you've thought about the implications of inlining
of SQL functions ... it's hard to see how to count those reasonably.

regards, tom lane


From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-07-16 12:31:13
Message-ID: 469B6511.3000402@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neil Conway wrote:
>
> (schemaname, procname, nargs) is still ambiguous in the face of function
> overloading. Although the presence of procid uniquely identifies each
> function anyway, if you're going to include the name and argument
> information, it might be worth including the argument types as well (as
> an array of regtype, perhaps).
>

This is true. I was being a bit selfish here - in our environment we don't
use overloading that much - schema, name and nargs are usually sufficient.
It is also convinient to be able to do "select * from pg..." and have the
output not to wrap around.

>>From looking quickly at the patch, I don't think the current coding
> handles set-returning functions (ExecMakeTableFunctionResult).
>

Hmm, should be handled. The number of tuples returned is not counted though,
perhaps this is another thing to add ...

Regards,
Martin


From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-07-16 16:10:40
Message-ID: 469B9880.2090408@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> I really dislike that approach to deciding which functions to count.
> The main problem with it is that it will try to count C-language
> functions that are added after initdb, such as contrib stuff and
> third-party add-ons like postgis. The percentage overhead for a
> typical short C function will be large, and I'm not sure anything
> much is to be gained by counting these.
>
Agreed, it is a bit ugly. In the initial version there was no such
exclusion. This came only after I "accidentally" activated the
collection on a busy server. Checked cpu usage some half an hour later
and immediately disabled it. System mode consumption was unacceptably
high, but the results indicated that something funny was going
on inside application as well - some internal functions such as texteq()
and now() had unreasonably high call counts. I'm still investigating
those issues, but maybe it might be useful to have stats for internal
and C language functions as well?

Perhaps it is possible to make the collection configurable per language.
So that for instance, default is to only collect pl languages, with the
option to add C or internal. Not sure how to approach this though.

> I think a more reasonable approach would be to count PL-language
> functions; which in turn suggests that the instrumentation hooks
> should be in the PL call handlers, not in ExecMakeFunctionResult
> and friends where they will drag down performance of all functions.
>
It would be great if all the procedural languages would go through a
single entry point. Right now the more exotic PL-s would need separate
patching. But indeed, this approach would be less intrusive.

> BTW, I dunno if you've thought about the implications of inlining
> of SQL functions ... it's hard to see how to count those reasonably.
>
Yes, this creates some inconsistencies in what is collected and what not.
Unless, of course, only PL functions are counted :)

Regards,
Martin


From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Subject: Re: stored procedure stats in collector
Date: 2007-09-20 13:08:58
Message-ID: 46F270EA.8040103@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Howdy,

Here's an updated version of the function stats patch. The biggest change is
that by default only procedural language functions are counted. The GUC variable
stats_function_level now takes 3 values: on, off and all. The latter also
counts SQL and C language functions, "on" means only to count procedural
language functions. The decision is now made in fmgr_info_cxt_security()
Trigger functions are now also counted.

Sample output:

select procname, calls, total_time, total_cpu, self_time, self_cpu
from pg_stat_user_functions order by self_cpu desc limit 5;

procname | calls | total_time | total_cpu | self_time | self_cpu
--------------------+-------+------------+-----------+-----------+----------
next_batch | 32765 | 27139 | 8574 | 27139 | 8574
fetch_batch_events | 3636 | 9252 | 5622 | 3771 | 2717
batch_event_sql | 3636 | 5454 | 2888 | 3910 | 1962
finish_batch | 3420 | 3215 | 1475 | 3215 | 1475
batch_event_tables | 3636 | 1448 | 865 | 1434 | 858
(5 rows)

This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
usage here at Skype (about a month on reasonably busy boxes). So far so good.

Couple of issues:
- sometimes self_cpu > self_time - currently blaming it on Linux version of
gettimeofday().
- dropped functions are not purged from stats, might bloat the stats file for
some usage patterns.

PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?

Regards,
Martin

Attachment Content-Type Size
8.2-procstat.patch.gz application/x-gzip 12.3 KB

From: Neil Conway <neilc(at)samurai(dot)com>
To: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-21 00:34:32
Message-ID: 1190334872.16377.20.camel@dell.linuxdev.us.dell.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote:
> The GUC variable stats_function_level now takes 3 values: on, off and all.

That seems a confusing set of values. Perhaps "off", "pl", and "all"
would be clearer?

I'm curious if you've measured the performance overhead of enabling this
functionality.

> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?

It sounds like a useful feature to me.

-Neil


From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-24 13:10:32
Message-ID: 46F7B748.7040105@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neil Conway wrote:
> On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote:
>> The GUC variable stats_function_level now takes 3 values: on, off and all.
>
> That seems a confusing set of values. Perhaps "off", "pl", and "all"
> would be clearer?

Makes sense. It appears that the stats_ prefixed GUC names are deprecated now.
Will rename to "track_functions" and change values to "off", "pl" and
"all". Or should I use "none" instead of "off"?

> I'm curious if you've measured the performance overhead of enabling this
> functionality.
>

Just finished a simple benchmark comparing unpatched, "off" and "all":

unpatched usr sys total
dummy 21.36 0.68 22.04
rfact 51.52 0.07 51.59
ifact 20.40 0.06 20.45

"off" usr sys total diff x
dummy 21.59 0.39 21.97 1.00
rfact 52.40 0.06 52.46 1.02
ifact 20.78 0.02 20.8 1.02

"all" usr sys total diff x
dummy 21.69 35.18 56.87 2.58
rfact 47.83 38.99 86.83 1.68
ifact 20.65 0.36 21.01 1.03

dummy is blank pl/pgsql function (ran 10000000 times). rfact is recursive factorial,
ifact is iterative (both ran 1000000 times). Comparing median of 3 runs. The usr and
sys values are from log_executor_stats.

It seems that the overhead is unnoticeable if disabled, very visible for lightweight
functions and heavy callers. Almost unnoticeable for more compute intensive functions.

>> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?
>
> It sounds like a useful feature to me.
>
Good. I'll bring the patch up to date with HEAD.

Regards,
Martin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Cc: Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-24 15:36:56
Message-ID: 14479.1190648216@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com> writes:
> Neil Conway wrote:
>> That seems a confusing set of values. Perhaps "off", "pl", and "all"
>> would be clearer?

> Makes sense. It appears that the stats_ prefixed GUC names are deprecated now.
> Will rename to "track_functions" and change values to "off", "pl" and
> "all". Or should I use "none" instead of "off"?

"None" seems good, by analogy with log_statement's values.

> It seems that the overhead is unnoticeable if disabled, very visible
> for lightweight functions and heavy callers. Almost unnoticeable for
> more compute intensive functions.

The very high system times seem odd. Maybe you have a machine with
slow gettimeofday()?

regards, tom lane


From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-24 17:08:04
Message-ID: 46F7EEF4.7090102@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> It seems that the overhead is unnoticeable if disabled, very visible
>> for lightweight functions and heavy callers. Almost unnoticeable for
>> more compute intensive functions.
>
> The very high system times seem odd. Maybe you have a machine with
> slow gettimeofday()?
>

So it seems (on Ubuntu 7.04). Reran the benchmark on another system and
got more reasonable results:

"none" usr sys total
dummy 15.9 0.68 16.58
rfact 39.37 0.07 39.44
ifact 12.89 0.01 12.9

"all" usr sys total diff x
dummy 20.82 4.44 25.26 1.52
rfact 45.17 4.17 49.34 1.25
ifact 12.92 0.03 12.96 1.00

Regards,
Martin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-26 10:42:19
Message-ID: 200709261042.l8QAgJx00445@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Martin Pihlak wrote:
> Howdy,
>
> Here's an updated version of the function stats patch. The biggest change is
> that by default only procedural language functions are counted. The GUC variable
> stats_function_level now takes 3 values: on, off and all. The latter also
> counts SQL and C language functions, "on" means only to count procedural
> language functions. The decision is now made in fmgr_info_cxt_security()
> Trigger functions are now also counted.
>
> Sample output:
>
> select procname, calls, total_time, total_cpu, self_time, self_cpu
> from pg_stat_user_functions order by self_cpu desc limit 5;
>
> procname | calls | total_time | total_cpu | self_time | self_cpu
> --------------------+-------+------------+-----------+-----------+----------
> next_batch | 32765 | 27139 | 8574 | 27139 | 8574
> fetch_batch_events | 3636 | 9252 | 5622 | 3771 | 2717
> batch_event_sql | 3636 | 5454 | 2888 | 3910 | 1962
> finish_batch | 3420 | 3215 | 1475 | 3215 | 1475
> batch_event_tables | 3636 | 1448 | 865 | 1434 | 858
> (5 rows)
>
> This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
> usage here at Skype (about a month on reasonably busy boxes). So far so good.
>
> Couple of issues:
> - sometimes self_cpu > self_time - currently blaming it on Linux version of
> gettimeofday().
> - dropped functions are not purged from stats, might bloat the stats file for
> some usage patterns.
>
> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?
>
> Regards,
> Martin
>

[ application/x-gzip is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-26 12:24:09
Message-ID: 20070926122409.GA11126@depesz.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote:
> That seems a confusing set of values. Perhaps "off", "pl", and "all"
> would be clearer?
> I'm curious if you've measured the performance overhead of enabling this
> functionality.

i'm quite worried about "all" setting.
all operators are functions as well, so tracking literally all function
calls might be too heavy.

on the other hand i would like to be able to track some non-pl functions
like RI%.

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA. here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)


From: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
To: depesz(at)depesz(dot)com
Cc: Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2007-09-26 13:08:54
Message-ID: 46FA59E6.4080508@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

hubert depesz lubaczewski wrote:
> On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote:
>> That seems a confusing set of values. Perhaps "off", "pl", and "all"
>> would be clearer?
>> I'm curious if you've measured the performance overhead of enabling this
>> functionality.
>
> i'm quite worried about "all" setting.
> all operators are functions as well, so tracking literally all function
> calls might be too heavy.
>
"all" would additionally count only C and SQL functions, builtins (language
"internal") are skipped. Maybe there should be option to count these as well,
perhaps with a really scare name (track_functions = all_plus_internal).

> on the other hand i would like to be able to track some non-pl functions
> like RI%.
>
These are builtin.

Regards,
Martin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Martin Pihlak <martin(dot)pihlak(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: stored procedure stats in collector
Date: 2008-03-26 01:42:38
Message-ID: 200803260142.m2Q1gcq29597@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


This has been saved for the next commit-fest:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Martin Pihlak wrote:
> Howdy,
>
> Here's an updated version of the function stats patch. The biggest change is
> that by default only procedural language functions are counted. The GUC variable
> stats_function_level now takes 3 values: on, off and all. The latter also
> counts SQL and C language functions, "on" means only to count procedural
> language functions. The decision is now made in fmgr_info_cxt_security()
> Trigger functions are now also counted.
>
> Sample output:
>
> select procname, calls, total_time, total_cpu, self_time, self_cpu
> from pg_stat_user_functions order by self_cpu desc limit 5;
>
> procname | calls | total_time | total_cpu | self_time | self_cpu
> --------------------+-------+------------+-----------+-----------+----------
> next_batch | 32765 | 27139 | 8574 | 27139 | 8574
> fetch_batch_events | 3636 | 9252 | 5622 | 3771 | 2717
> batch_event_sql | 3636 | 5454 | 2888 | 3910 | 1962
> finish_batch | 3420 | 3215 | 1475 | 3215 | 1475
> batch_event_tables | 3636 | 1448 | 865 | 1434 | 858
> (5 rows)
>
> This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
> usage here at Skype (about a month on reasonably busy boxes). So far so good.
>
> Couple of issues:
> - sometimes self_cpu > self_time - currently blaming it on Linux version of
> gettimeofday().
> - dropped functions are not purged from stats, might bloat the stats file for
> some usage patterns.
>
> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?
>
> Regards,
> Martin
>

[ application/x-gzip is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +