WIP: executor_hook for pg_stat_statements

Lists: pgsql-hackerspgsql-patches
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org
Subject: WIP: executor_hook for pg_stat_statements
Date: 2008-06-23 06:22:57
Message-ID: 20080623150535.946E.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I'm working on light-weight SQL logging for PostgreSQL.
http://archives.postgresql.org/pgsql-hackers/2008-06/msg00601.php

I divide the SQL logging feature into a core patch and an extension module.
I hope only the patch is to be applied in the core. The extension module
would be better to be developed separately from the core.

The attached patch (executor_hook.patch) modifies HEAD as follows.

- Add "tag" field (uint32) into PlannedStmt.
- Add executor_hook to replace ExecutePlan().
- Move ExecutePlan() to a global function.

The archive file (pg_stat_statements.tar.gz) is a sample extension module.
It uses the existing planner_hook and the new executor_hook to record
statements on planned and executed. You can see all of executed statements
through the following VIEW:

View "public.pg_stat_statements"
Column | Type | Description
------------+--------+------------------------------------
userid | oid | user id who execute the statement
datid | oid | target database
query | text | query's SQL text
planned | bigint | number of planned
calls | bigint | number of executed
total_time | bigint | total executing time in msec

Here is a sample output of the view.

postgres=# SELECT pg_stat_statements_reset();
$ pgbench -c10 -t1000 -M prepared
postgres=# SELECT * FROM pg_stat_statements ORDER BY query;
userid | datid | query | planned | calls | total_time
--------+-------+-----------------------------------------------------------------------------------------------+---------+-------+------------
10 | 11505 | INSERT INTO history (tid, bid, aid, delta, mtime) VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP); | 10 | 10000 | 196
10 | 11505 | SELECT * FROM pg_stat_statements ORDER BY query; | 1 | 0 | 0
10 | 11505 | SELECT abalance FROM accounts WHERE aid = $1; | 10 | 10000 | 288
10 | 11505 | UPDATE accounts SET abalance = abalance + $1 WHERE aid = $2; | 10 | 10000 | 1269
10 | 11505 | UPDATE branches SET bbalance = bbalance + $1 WHERE bid = $2; | 10 | 10000 | 21737
10 | 11505 | UPDATE tellers SET tbalance = tbalance + $1 WHERE tid = $2; | 10 | 10000 | 6950
10 | 11505 | delete from history | 1 | 1 | 0
10 | 11505 | select count(*) from branches | 1 | 1 | 0
(8 rows)

You need to add the below options in postgresql.conf.
shared_preload_libraries = 'pg_stat_statements'
custom_variable_classes = 'statspack'
statspack.max_statements = 1000 # max number of distinct statements
statspack.statement_buffer = 1024 # buffer to record SQL text

This module is WIP and far from complete. It allocates fixed shared
memory and record SQLs there, but doesn't handle out-of-memory situaton
for now. Also, It can handle statements using extended prorocol or
prepared statements, but not simple protocol queries. And every user
can view other user's queries.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
pg_stat_statements.tar.gz application/octet-stream 3.6 KB
executor_hook.patch application/octet-stream 3.6 KB

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: WIP: executor_hook for pg_stat_statements
Date: 2008-07-04 08:44:25
Message-ID: 1215161065.4051.26.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Mon, 2008-06-23 at 15:22 +0900, ITAGAKI Takahiro wrote:
> I'm working on light-weight SQL logging for PostgreSQL.
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00601.php
>
> I divide the SQL logging feature into a core patch and an extension module.
> I hope only the patch is to be applied in the core. The extension module
> would be better to be developed separately from the core.
>
>
> The attached patch (executor_hook.patch) modifies HEAD as follows.
>
> - Add "tag" field (uint32) into PlannedStmt.
> - Add executor_hook to replace ExecutePlan().
> - Move ExecutePlan() to a global function.

The executor_hook.patch is fairly trivial and I see no errors.

The logic of including such a patch is clear. If we have a planner hook
then we should also have an executor hook.

Will you be completing the plugin for use in contrib?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-07 02:03:32
Message-ID: 20080707104202.73E5.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> > The attached patch (executor_hook.patch) modifies HEAD as follows.
> >
> > - Add "tag" field (uint32) into PlannedStmt.
> > - Add executor_hook to replace ExecutePlan().
> > - Move ExecutePlan() to a global function.
>
> The executor_hook.patch is fairly trivial and I see no errors.
>
> The logic of including such a patch is clear. If we have a planner hook
> then we should also have an executor hook.

One issue is "tag" field. The type is now uint32. It's enough in my plugin,
but if some people need to add more complex structures in PlannedStmt,
Node type would be better rather than uint32. Which is better?

> Will you be completing the plugin for use in contrib?

Yes, I'll fix memory management in my plugin and re-post it
by the next commit-fest.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-07 07:10:00
Message-ID: 1215414601.4051.432.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Mon, 2008-07-07 at 11:03 +0900, ITAGAKI Takahiro wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > > The attached patch (executor_hook.patch) modifies HEAD as follows.
> > >
> > > - Add "tag" field (uint32) into PlannedStmt.
> > > - Add executor_hook to replace ExecutePlan().
> > > - Move ExecutePlan() to a global function.
> >
> > The executor_hook.patch is fairly trivial and I see no errors.
> >
> > The logic of including such a patch is clear. If we have a planner hook
> > then we should also have an executor hook.
>
> One issue is "tag" field. The type is now uint32. It's enough in my plugin,
> but if some people need to add more complex structures in PlannedStmt,
> Node type would be better rather than uint32. Which is better?

I was imagining that tag was just an index to another data structure,
but probably better if its a pointer.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-07 14:51:03
Message-ID: 10029.1215442263@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Mon, 2008-07-07 at 11:03 +0900, ITAGAKI Takahiro wrote:
>> One issue is "tag" field. The type is now uint32. It's enough in my plugin,
>> but if some people need to add more complex structures in PlannedStmt,
>> Node type would be better rather than uint32. Which is better?

> I was imagining that tag was just an index to another data structure,
> but probably better if its a pointer.

I don't want the tag there at all, much less converted to a pointer.
What would the semantics be of copying the node, and why?

Please justify why you must have this and can't do what you want some
other way.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-08 01:29:04
Message-ID: 20080708091907.6D35.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I don't want the tag there at all, much less converted to a pointer.
> What would the semantics be of copying the node, and why?
>
> Please justify why you must have this and can't do what you want some
> other way.

In my pg_stat_statements plugin, the tag is used to cache hash values of
SQL strings in PlannedStmt. It is not necessarily needed because the hash
value is re-computable from debug_query_string. It is just for avoiding
the work. In addition, we see different SQLs in debug_query_string in
PREPARE/EXECUTE and DECLARE/FETCH. Hashed SQL cache can work on those
commands.

However, it's ok to remove the tag field from the patch if we should
avoid such an unused field in normal use. EXECUTE and FETCH issues
could be solved if I write simple SQL parser in my plugin because SQL
bodies can be fetched from pg_prepared_statements and pg_cursors.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-08 12:16:31
Message-ID: 1215519391.4051.876.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Mon, 2008-07-07 at 10:51 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Mon, 2008-07-07 at 11:03 +0900, ITAGAKI Takahiro wrote:
> >> One issue is "tag" field. The type is now uint32. It's enough in my plugin,
> >> but if some people need to add more complex structures in PlannedStmt,
> >> Node type would be better rather than uint32. Which is better?
>
> > I was imagining that tag was just an index to another data structure,
> > but probably better if its a pointer.
>
> I don't want the tag there at all, much less converted to a pointer.
> What would the semantics be of copying the node, and why?
>
> Please justify why you must have this and can't do what you want some
> other way.

Agreed. If we have plugins for planner and executor we should be able to
pass information around in the background. We have mechanisms for two
plugins to rendezvous, so we can use that if they're completely separate
plugins.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-10 20:11:42
Message-ID: 13432.1215720702@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't want the tag there at all, much less converted to a pointer.
>> What would the semantics be of copying the node, and why?
>>
>> Please justify why you must have this and can't do what you want some
>> other way.

> In my pg_stat_statements plugin, the tag is used to cache hash values of
> SQL strings in PlannedStmt. It is not necessarily needed because the hash
> value is re-computable from debug_query_string. It is just for avoiding
> the work. In addition, we see different SQLs in debug_query_string in
> PREPARE/EXECUTE and DECLARE/FETCH. Hashed SQL cache can work on those
> commands.

Actually, that aspect of the plugin is 100% broken anyway, because it
assumes that debug_query_string has got something to do with the query
being executed. There are any number of scenarios where this is a bad
assumption.

I wonder whether we ought to change things so that the real query
source text is available at the executor level. Since we are (at least
usually) storing the query text in cached plans, I think this might just
require some API refactoring, not extra space and copying. It would
amount to a permanent decision that we're willing to pay the overhead
of keeping the source text around, though.

Also, after looking at the patch more closely, was there a good reason
for making the hook intercept ExecutePlan rather than ExecutorRun?
ExecutePlan was never intended to have a stable public API --- its
argument list is just a happenstance of what ExecutorRun needs to
fetch for its own purposes. I think we should keep it private and
have ExecutorRun do

if (hook)
hook(...);
else
standard_ExecutorRun(...);

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-15 04:51:31
Message-ID: 1216097491.19656.44.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Thu, 2008-07-10 at 16:11 -0400, Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I don't want the tag there at all, much less converted to a pointer.
> >> What would the semantics be of copying the node, and why?
> >>
> >> Please justify why you must have this and can't do what you want some
> >> other way.
>
> > In my pg_stat_statements plugin, the tag is used to cache hash values of
> > SQL strings in PlannedStmt. It is not necessarily needed because the hash
> > value is re-computable from debug_query_string. It is just for avoiding
> > the work. In addition, we see different SQLs in debug_query_string in
> > PREPARE/EXECUTE and DECLARE/FETCH. Hashed SQL cache can work on those
> > commands.
>
> Actually, that aspect of the plugin is 100% broken anyway, because it
> assumes that debug_query_string has got something to do with the query
> being executed. There are any number of scenarios where this is a bad
> assumption.

> I wonder whether we ought to change things so that the real query
> source text is available at the executor level. Since we are (at least
> usually) storing the query text in cached plans, I think this might just
> require some API refactoring, not extra space and copying. It would
> amount to a permanent decision that we're willing to pay the overhead
> of keeping the source text around, though.

I think its a reasonable decision to do that. Knowing what you're doing
while you do it is pretty important.

We should look to keep some kind of tag around though. It would be
useful to avoid performing operations on the SQL string itself and just
keep it for display. I would imagine re-hashing the plan each time we
execute it would cost more than we would like, *especially* when running
a performance profiling plugin.

> Also, after looking at the patch more closely, was there a good reason
> for making the hook intercept ExecutePlan rather than ExecutorRun?
> ExecutePlan was never intended to have a stable public API --- its
> argument list is just a happenstance of what ExecutorRun needs to
> fetch for its own purposes. I think we should keep it private and
> have ExecutorRun do
>
> if (hook)
> hook(...);
> else
> standard_ExecutorRun(...);

Much better place.

That raises the question of whether we should have ExecutorStart() and
ExecutorEnd() hooks as well, to round things off.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-15 07:25:46
Message-ID: 20080715150758.3016.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> > I wonder whether we ought to change things so that the real query
> > source text is available at the executor level. Since we are (at least
> > usually) storing the query text in cached plans, I think this might just
> > require some API refactoring, not extra space and copying. It would
> > amount to a permanent decision that we're willing to pay the overhead
> > of keeping the source text around, though.
>
> I think its a reasonable decision to do that. Knowing what you're doing
> while you do it is pretty important.

I worked around to it, I found I can use ActivePortal->sourceText in some
situations. But there are still some problems:

- SQL functions:
They don't modify ActivePortal->sourceText, but we could get the
source from SQLFunctionCache->src. If it is required, we might
need to add a new field in QueryDesc and copy the src to the field.
- Multiple queries:
Query text is not divided into each query.
Only the original combined text is available.
- RULEs:
There are similar issues with multiple queries.
Also, they don't have original query texts.

The same can be said for planner_hook(). Only available query text is
debug_query_string in it, and it is the top-level query. We cannot use
the actual SQL text which the Query object comes from. The treu query
text might be SQL functions used in the top-level query, a part of multiple
queries, or another query rewritten by RULE.

For these reasons, now I'm thinking to collect only top-query level
statistics, not per-planner+executor level statistics. i.e, when we
receive a multiple query "SELECT 1; SELECT 2;", pg_stat_statements uses
the original combined text as a key. Comsumed resource associated with
the key is sum of resources used in both "SELECT 1" and "SELECT 2".

> > Also, after looking at the patch more closely, was there a good reason
> > for making the hook intercept ExecutePlan rather than ExecutorRun?
>
> That raises the question of whether we should have ExecutorStart() and
> ExecutorEnd() hooks as well, to round things off.

Yeah, and also ExecutorRewind() hook. There are 4 interface functions in
executor. My addin only needs Run hook because it doesn't modify the actual
behavior of executor. However, when someone hope to replace the behavior,
they need all of the hooks. (Is multi-threaded executor project still alive?)

How about adding new Executor class
and ExecutorStart() returns an instance of Executor?

typedef struct Executor
{
ExecutorRunFunc run;
ExecutorEndFunc end;
ExecutorRewindFunc rewind;
/* there might be private fields. */
} Executor;

Executor *e = ExecutorStart_hook(...);
ExecutorRun(e, ...) => { e->run(e, ...); }
ExecutorEnd(e, ...) => { e->end(e, ...); }

It could be make APIs cleaner because QueryDesc has 3 fields only for
executor (tupDesc, estate, planstate). We can move those fields to
Executor's private fields. Is this modification acceptable?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-15 08:31:29
Message-ID: 1216110689.19656.132.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-15 at 16:25 +0900, ITAGAKI Takahiro wrote:
> > > Also, after looking at the patch more closely, was there a good
> reason
> > > for making the hook intercept ExecutePlan rather than ExecutorRun?
> >
> > That raises the question of whether we should have ExecutorStart()
> and
> > ExecutorEnd() hooks as well, to round things off.
>
> Yeah, and also ExecutorRewind() hook. There are 4 interface functions
> in executor. My addin only needs Run hook because it doesn't modify
> the actual behavior of executor. However, when someone hope to replace
> the behavior, they need all of the hooks. (Is multi-threaded executor
> project still alive?)

No plans here, just thinking: if we do it, do it once.

The reason I wasn't thinking about the rewind part though was it seems
like someone might want to set up or tear down something at appropriate
times, so adding Start/End felt "obvious". Yes, lets have Rewind also.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-15 14:56:55
Message-ID: 19450.1216133815@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> Also, after looking at the patch more closely, was there a good reason
>>> for making the hook intercept ExecutePlan rather than ExecutorRun?
>>
>> That raises the question of whether we should have ExecutorStart() and
>> ExecutorEnd() hooks as well, to round things off.

> Yeah, and also ExecutorRewind() hook.

I'm not impressed by this line of argument. If we start putting in
hooks just because someone might need 'em someday, we'd soon end up with
hundreds or thousands of mostly-useless hooks. I'm happy to put in
hooks that there's a demonstrated need for, but I don't believe that
"replace the executor without touching the core code" is a sane goal.
Even if it were, the API of the executor to the rest of the system
is a whole lot wider than four functions.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-16 02:49:13
Message-ID: 20080716104154.7860.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> >> That raises the question of whether we should have ExecutorStart() and
> >> ExecutorEnd() hooks as well, to round things off.
> > Yeah, and also ExecutorRewind() hook.
>
> I'm happy to put in hooks that there's a demonstrated need for,

Hmm, ok. I just want to hook ExecutorRun, so I'll just propose to
add ExecutorRun_hook now.

The attached patch is the proposal. It adds two global symbols:
* ExecutorRun_hook - replacing behavior of ExecutorRun()
* standard_ExecutorRun() - default behavior of ExecutorRun()

And also modifies one funtion:
* ExecuteQuery() - It passes prepared query's text to portal so that
the prepared query's text is available at the executor level.
This change is almost free because it copys only string pointer,
not the string buffer.

The attached archive pg_stat_statements.tar.gz is a demonstration of
ExecutorRun_hook. It collect per-statement statistics of number of planned
and executed, plan cost, execution time, and buffer gets/reads/writes.
I'll happy if the addin will be accepted as contrib module, but if it is
not suitable, I'm willing to move it to pgFoundry.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
ExecutorRun_hook.patch application/octet-stream 2.4 KB
pg_stat_statements.tar.gz application/octet-stream 7.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] WIP: executor_hook for pg_stat_statements
Date: 2008-07-18 20:42:36
Message-ID: 9775.1216413756@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> The attached patch is the proposal. It adds two global symbols:
> * ExecutorRun_hook - replacing behavior of ExecutorRun()
> * standard_ExecutorRun() - default behavior of ExecutorRun()

Applied.

> And also modifies one funtion:
> * ExecuteQuery() - It passes prepared query's text to portal so that
> the prepared query's text is available at the executor level.
> This change is almost free because it copys only string pointer,
> not the string buffer.

This patch is unsafe because the portal could outlive the cached plan
source (consider the case that a called function does a DEALLOCATE).
However, I don't see any compelling argument against doing a pstrdup
here. I did that and also went around and made assumptions uniform
about always having a source string for a cached plan or Portal.
So ActivePortal->sourceText should be a safe thing to consult to
see the source text of the most closely nested query being executed.
(Inside a plpgsql function, for instance, this would be the current
SQL statement of the function.)

> The attached archive pg_stat_statements.tar.gz is a demonstration of
> ExecutorRun_hook. It collect per-statement statistics of number of planned
> and executed, plan cost, execution time, and buffer gets/reads/writes.

I don't think this works yet --- you are still using debug_query_string,
and you're assuming it will be consistent with ActivePortal->sourceText,
which it won't be in function calls and other situations.

regards, tom lane