Re: pg_stat_statements and planning time

Lists: pgsql-hackers
From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_stat_statements and planning time
Date: 2012-03-07 11:45:52
Message-ID: CAHGQGwFx_=DO-Gu-MfPW3VQ4qC7TfVdH2zHmvZfrGv6fQ3D-Tw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

pg_stat_statements is basically very helpful to find out slow queries.
But since it doesn't report the time spent in the planner, we cannot
find out slow queries which take most time to do query planning, from
pg_stat_statements. Is there any reason why pg_stat_statements doesn't
collect the planning time?

Attached patch extends pg_stat_statements so that it reports the
planning time. Thought?

In the patch, I didn't change the column name "total_time" meaning
the time spent in the executor because of the backward compatibility.
But once new column "plan_time" is added, "total_time" is confusing and
ISTM it should be renamed...

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
pg_stat_statements_plantime_v1.patch text/x-diff 11.0 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-07 12:00:12
Message-ID: CA+U5nM+GzVaEXfZMRVHEj_5qSfsxwOuK+dzhOnDeSOoTBU5FjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 7, 2012 at 11:45 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:

> Attached patch extends pg_stat_statements so that it reports the
> planning time. Thought?

If we successfully aggregate SQL in the current patch then this might
be useful as well. Until we do that it's not much use.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-07 12:18:54
Message-ID: CAHGQGwGzNtu+_s-KC3edLRT7=hMQ2gwSOZv7wk8k_64QYtiS0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 7, 2012 at 9:00 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, Mar 7, 2012 at 11:45 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
>> Attached patch extends pg_stat_statements so that it reports the
>> planning time. Thought?
>
> If we successfully aggregate SQL in the current patch then this might
> be useful as well. Until we do that it's not much use.

You mean pg_stat_statements normalization patch? Yes, if it will be applied,
this patch would be more useful. But without the normalization patch, this patch
would be useful for a user who use the extended protocol but not plan cache
mechanism, I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-07 15:39:40
Message-ID: CA+TgmoY1HFaX9wqwMRKpsYJheYGzpY=7D2dEnz6R7MpZYOjNMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 7, 2012 at 6:45 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> pg_stat_statements is basically very helpful to find out slow queries.
> But since it doesn't report the time spent in the planner, we cannot
> find out slow queries which take most time to do query planning, from
> pg_stat_statements. Is there any reason why pg_stat_statements doesn't
> collect the planning time?
>
> Attached patch extends pg_stat_statements so that it reports the
> planning time. Thought?

I think this is an interesting idea, but I think it's too late for 9.2.

I'd like to have the planning time in a number of other places as
well, such as EXPLAIN, and maybe statistics views.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-07 16:07:08
Message-ID: 8309.1331136428@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
> In the patch, I didn't change the column name "total_time" meaning
> the time spent in the executor because of the backward compatibility.
> But once new column "plan_time" is added, "total_time" is confusing and
> ISTM it should be renamed...

Well, if we were tracking planning time, what I would expect
"total_time" to mean is plan time plus execution time. Should it be
redefined that way, instead of renaming it?

Another point here is that because of plan caching, the number of
planner invocations could be quite different from the number of executor
runs. It's not clear to me whether this will confuse matters for
pg_stat_statements, but it's something to think about. Will it be
possible to tell whether a particular statement is hugely expensive to
plan but we don't do that often, versus cheap to plan but we do that a
lot? IOW I am wondering if we need to track the number of invocations
as well as total time.

regards, tom lane


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-07 18:53:36
Message-ID: CAAZKuFYo93RZn5FbfMB3RrHUiLDcJrgUJk_ZTfsOuXN6ftNvZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 7, 2012 at 8:07 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
>> In the patch, I didn't change the column name "total_time" meaning
>> the time spent in the executor because of the backward compatibility.
>> But once new column "plan_time" is added, "total_time" is confusing and
>> ISTM it should be renamed...
>
> Well, if we were tracking planning time, what I would expect
> "total_time" to mean is plan time plus execution time.  Should it be
> redefined that way, instead of renaming it?

I think you are right.

On first glance, a user would only be interested in one number, the
total time. Most people are hunting queries that happen
(surprisingly) frequently and don't have good indexes in place to
serve them, and in those cases planning time is negligible. However,
should planning time be the bottleneck then one has to investigate
fewer, smaller queries, a very different solution space.

> Another point here is that because of plan caching, the number of
> planner invocations could be quite different from the number of executor
> runs.  It's not clear to me whether this will confuse matters for
> pg_stat_statements, but it's something to think about.  Will it be
> possible to tell whether a particular statement is hugely expensive to
> plan but we don't do that often, versus cheap to plan but we do that a
> lot?  IOW I am wondering if we need to track the number of invocations
> as well as total time.

I don't think tracking a few more words will do much harm, and could
also do a lot of good....perhaps count, sum, sum(x**2) of planning and
execution?

(Few things to me are more annoying than an unadorned average when it
would have been feasible to take a square of a number).

Thoughts?

--
fdr


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-08 02:59:20
Message-ID: CAHGQGwG0zzq0MRH+xnxa8YZDtEEeWo44eRLBhpPMY1KtS2Ssgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 8, 2012 at 12:39 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Mar 7, 2012 at 6:45 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> pg_stat_statements is basically very helpful to find out slow queries.
>> But since it doesn't report the time spent in the planner, we cannot
>> find out slow queries which take most time to do query planning, from
>> pg_stat_statements. Is there any reason why pg_stat_statements doesn't
>> collect the planning time?
>>
>> Attached patch extends pg_stat_statements so that it reports the
>> planning time. Thought?
>
> I think this is an interesting idea, but I think it's too late for 9.2.

Yes. I will add this to the next commitfest. And, in the patch, I changed
pg_stat_statements--1.1.sql, but, for 9.3 I will have to create 1.2.sql instead.

> I'd like to have the planning time in a number of other places as
> well, such as EXPLAIN, and maybe statistics views.

+1 for EXPLAIN. But which statistics views are in your mind?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-08 03:29:29
Message-ID: CAHGQGwGmMu+HUywe01LieYfe4pensQ=wayezGK+aWMYbZ4Rgsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 8, 2012 at 1:07 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
>> In the patch, I didn't change the column name "total_time" meaning
>> the time spent in the executor because of the backward compatibility.
>> But once new column "plan_time" is added, "total_time" is confusing and
>> ISTM it should be renamed...
>
> Well, if we were tracking planning time, what I would expect
> "total_time" to mean is plan time plus execution time.  Should it be
> redefined that way, instead of renaming it?

Agreed, it's more intuitive for a user.

Along with "total_time" and "plan_time", should we also define "exec_time"
reporting only the execution time for improvement of usability though it can
be calculated from "total_time" and "plan_time"?

> Another point here is that because of plan caching, the number of
> planner invocations could be quite different from the number of executor
> runs.  It's not clear to me whether this will confuse matters for
> pg_stat_statements, but it's something to think about.  Will it be
> possible to tell whether a particular statement is hugely expensive to
> plan but we don't do that often, versus cheap to plan but we do that a
> lot?  IOW I am wondering if we need to track the number of invocations
> as well as total time.

Agreed to add something like "plan_count" column. This also would be helpful
for e.g., tuning the prepareThreshold parameter in JDBC.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-08 13:09:54
Message-ID: CA+Tgmobq3BVLghTu9XBjkxQ6adyfP+ppv+WKwybE+sL7cH9Wpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 7, 2012 at 9:59 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> I'd like to have the planning time in a number of other places as
>> well, such as EXPLAIN, and maybe statistics views.
>
> +1 for EXPLAIN. But which statistics views are in your mind?

I don't know. I'm not sure if it's interesting to be able to count
planning time vs. execution time on a server-wide basis, or even a
per-database basis, but it seems like it might be, if we can do it
cheaply. Then again, considering that gettimeofday is kinda
expensive, I suppose that would have to be optional if we were to have
it at all. Just thinking out loud, mostly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-08 13:31:32
Message-ID: CAEYLb_URfq8yVScaZY2KtVap9=jw_zFMeZXMMswBDY9MoRU99Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8 March 2012 13:09, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Then again, considering that gettimeofday is kinda
> expensive, I suppose that would have to be optional if we were to have
> it at all.

+1. I'm not opposed to having such a mechanism, but it really ought to
impose exactly no overhead on the common case where we don't
particularly care about plan time.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-08 14:44:06
Message-ID: 10436.1331217846@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Geoghegan <peter(at)2ndquadrant(dot)com> writes:
> On 8 March 2012 13:09, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Then again, considering that gettimeofday is kinda
>> expensive, I suppose that would have to be optional if we were to have
>> it at all.

> +1. I'm not opposed to having such a mechanism, but it really ought to
> impose exactly no overhead on the common case where we don't
> particularly care about plan time.

I thought the proposal was to add it to (1) pg_stat_statement and (2)
EXPLAIN, both of which are not in the normal code execution path.
pg_stat_statement is already a drag on a machine with slow gettimeofday,
but it's not clear why users of it would think that two gettimeofday's
per query are acceptable and four are not.

regards, tom lane


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements and planning time
Date: 2012-03-08 16:42:58
Message-ID: CAEYLb_WSr0eo+vOTcT7LikMfvc3p-x-BNRLFLj-tNJVuTGJBdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8 March 2012 14:44, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I thought the proposal was to add it to (1) pg_stat_statement and (2)
> EXPLAIN, both of which are not in the normal code execution path.
> pg_stat_statement is already a drag on a machine with slow gettimeofday,
> but it's not clear why users of it would think that two gettimeofday's
> per query are acceptable and four are not.

To be clear, I don't see any reasons to not just have it within
EXPLAIN output under all circumstances.

pg_stat_statements will slow down query execution, but I see no reason
to force users to pay for something that they may well not want by not
including an 'off' switch for this additional instrumentation, given
that it doubles the number of gettimeofdays. I'm not particularly
concerned about platforms with slow gettimeofdays. I'm concerned with
keeping the overhead of running pg_stat_statements as low as possible
generally.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services