Re: [WIP] showing index maintenance on EXPLAIN

Lists: pgsql-hackers
From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 00:00:44
Message-ID: CAJKUy5hJVXs_vHKsbJWMaJsLbpVaoHkT_YX7jM37kS3Z95EPKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

This patch implements $subject only when ANALYZE and VERBOSE are on.
I made it that way because for years nobody seemed interested in this
info (at least no one did it) so i decided that maybe is to much
information for most people (actually btree indexes are normally very
fast).

But because we have GiST and GIN this became an interested piece of
data (there are other cases even when using btree).

Current patch doesn't have docs yet i will add them soon.

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157

Attachment Content-Type Size
0001-Make-EXPLAIN-show-measurements-of-updating-indexes.patch text/x-patch 12.1 KB

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 00:03:30
Message-ID: 536AC9D2.6080409@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 05/07/2014 05:00 PM, Jaime Casanova wrote:
> Hi,
>
> This patch implements $subject only when ANALYZE and VERBOSE are on.
> I made it that way because for years nobody seemed interested in this
> info (at least no one did it) so i decided that maybe is to much
> information for most people (actually btree indexes are normally very
> fast).

What's "index maintenance"?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 00:04:57
Message-ID: CAJKUy5iwDpvR6z3PUVCMYA0W8PBmV3TZUMLbBmJCw1OkbR2vOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, May 7, 2014 at 7:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 05/07/2014 05:00 PM, Jaime Casanova wrote:
>> Hi,
>>
>> This patch implements $subject only when ANALYZE and VERBOSE are on.
>> I made it that way because for years nobody seemed interested in this
>> info (at least no one did it) so i decided that maybe is to much
>> information for most people (actually btree indexes are normally very
>> fast).
>
> What's "index maintenance"?
>

Maybe is not the right term... i mean: the time that take to update
indexes on an INSERT/UPDATE operation

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 03:52:02
Message-ID: CAA4eK1LkAMg-b6Ra5zYAYyc85771geQfrJzJPDBaxWciiV6Asg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> This patch implements $subject only when ANALYZE and VERBOSE are on.
> I made it that way because for years nobody seemed interested in this
> info (at least no one did it) so i decided that maybe is to much
> information for most people (actually btree indexes are normally very
> fast).

Why to capture only for Index Insert/Update and not for Read; is it
because Read will be always fast ot implementation complexity?

Why not similar timings for heap?

Why can't we print when only Analyze is used with Explain, the
execution time is printed with Analyze option?

Could you please tell in what all kind of scenario's, do you expect it
to be useful?
One I could think is that if there are multiple indexes on a table and user
wants to find out if any particular index is consuming more time.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 06:31:22
Message-ID: CAJKUy5jsz1-h2CZQbNpKTg6=M_pFzZXQGkTasN=gXwz8hN5uEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> Hi,
>>
>> This patch implements $subject only when ANALYZE and VERBOSE are on.
>> I made it that way because for years nobody seemed interested in this
>> info (at least no one did it) so i decided that maybe is to much
>> information for most people (actually btree indexes are normally very
>> fast).
>
>
> Why to capture only for Index Insert/Update and not for Read; is it
> because Read will be always fast ot implementation complexity?
>

EXPLAIN ANALYZE already shows that on any SELECT that uses an index in
some way. Or are you thinking on something else?

> Why not similar timings for heap?
>

well "actual time" shows us total time of the operation so just
deducting the time spent on triggers, indexes and planning seems like
a way to get "heap modification time".

yes, maybe we still need some additional data. for example, i could
want to know how much time we spent extending a relation.

> Why can't we print when only Analyze is used with Explain, the
> execution time is printed with Analyze option?
>

i'm not sure the info is useful for everyone, i'm not opposed to show
it all the time though

> Could you please tell in what all kind of scenario's, do you expect it
> to be useful?
> One I could think is that if there are multiple indexes on a table and user
> wants to find out if any particular index is consuming more time.
>

exactly my use case. consider this plan (we spent 78% of the time
updating the index uniq_idx_on_text):

QUERY PLAN
--------------------------------------------------------------------------------------------
Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1)
-> Result (actual time=0.046..0.049 rows=1 loops=1)
Output: <some long data here>
Index uniq_idx_on_text on t1: time=0.421 rows=1
Index t1_pkey on t1: time=0.027 rows=1
Total runtime: 0.643 ms
(6 rows)

so i want to answer questions like, how much an index is hurting write
performance? once i know that i can look for alternative solutions.
In that vein, it was interesting to see how fastupdate affect
performance in a GIN index using gin_trgm_ops (5 times slower with
fastupdate=off)

(fastupdate=on) Index idx_ggin on t1: time=0.418 rows=1
(fastupdate=off) Index idx_ggin on t1: time=2.205 rows=1

this is not different to showing trigger time info, which we already do

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 15:41:44
Message-ID: CA+TgmobFcnXR3kN603kZidUK-dAzrycujwf2sD6Aiz3HpFVB9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>> Hi,
>>>
>>> This patch implements $subject only when ANALYZE and VERBOSE are on.
>>> I made it that way because for years nobody seemed interested in this
>>> info (at least no one did it) so i decided that maybe is to much
>>> information for most people (actually btree indexes are normally very
>>> fast).
>>
>>
>> Why to capture only for Index Insert/Update and not for Read; is it
>> because Read will be always fast ot implementation complexity?
>>
>
> EXPLAIN ANALYZE already shows that on any SELECT that uses an index in
> some way. Or are you thinking on something else?
>
>> Why not similar timings for heap?
>>
>
> well "actual time" shows us total time of the operation so just
> deducting the time spent on triggers, indexes and planning seems like
> a way to get "heap modification time".
>
> yes, maybe we still need some additional data. for example, i could
> want to know how much time we spent extending a relation.
>
>> Why can't we print when only Analyze is used with Explain, the
>> execution time is printed with Analyze option?
>>
>
> i'm not sure the info is useful for everyone, i'm not opposed to show
> it all the time though
>
>> Could you please tell in what all kind of scenario's, do you expect it
>> to be useful?
>> One I could think is that if there are multiple indexes on a table and user
>> wants to find out if any particular index is consuming more time.
>>
>
> exactly my use case. consider this plan (we spent 78% of the time
> updating the index uniq_idx_on_text):
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------
> Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1)
> -> Result (actual time=0.046..0.049 rows=1 loops=1)
> Output: <some long data here>
> Index uniq_idx_on_text on t1: time=0.421 rows=1
> Index t1_pkey on t1: time=0.027 rows=1
> Total runtime: 0.643 ms
> (6 rows)

I would have expected the information about index maintenance times to
be associated with the Insert node, not the plan overall. IIUC, you
could have more than one such node if, for example, there are
writeable CTEs involved.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-08 18:05:46
Message-ID: CAJKUy5hEe63zODbbaeUvk2YVr+eL-KErDk5xavK967NHmk3Lyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 8, 2014 at 10:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>> Hi,
>>>>
>>>> This patch implements $subject only when ANALYZE and VERBOSE are on.
>>>> I made it that way because for years nobody seemed interested in this
>>>> info (at least no one did it) so i decided that maybe is to much
>>>> information for most people (actually btree indexes are normally very
>>>> fast).
>>>
>>>
> I would have expected the information about index maintenance times to
> be associated with the Insert node, not the plan overall. IIUC, you
> could have more than one such node if, for example, there are
> writeable CTEs involved.
>

i followed how trigger info is showed in explain. I can try to change
it, if that is what most people prefer.

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Insert on pgbench_accounts (actual time=0.249..0.249 rows=0 loops=1)
CTE results
-> Insert on pgbench_accounts pgbench_accounts_1 (actual
time=0.152..0.159 rows=1 loops=1)
-> Result (actual time=0.003..0.004 rows=1 loops=1)
-> CTE Scan on results (actual time=0.165..0.174 rows=1 loops=1)
Trigger trg1 on pgbench_accounts: time=0.033 calls=1
Trigger trg1 on pgbench_accounts: time=0.059 calls=1
Total runtime: 0.377 ms
(8 rows)

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-09 03:44:19
Message-ID: CAA4eK1K8vHrPo6WuNktGeTmCX_Aq47yVdxRfv1Y-VThodpJ+2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 8, 2014 at 12:01 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> Why to capture only for Index Insert/Update and not for Read; is it
>> because Read will be always fast ot implementation complexity?
>>
>
> EXPLAIN ANALYZE already shows that on any SELECT that uses an index in
> some way. Or are you thinking on something else?

postgres=# explain analyze select * from t1 where c1 > 50000 and c1 <60000;
QUERY PLAN

--------------------------------------------------------------------------------
----------------------------------------
Index Scan using idx1_t1 on t1 (cost=0.29..983.57 rows=10014 width=508) (actua
l time=0.033..11.826 rows=9999 loops=1)
Index Cond: ((c1 > 50000) AND (c1 < 60000))
Planning time: 2.001 ms
Execution time: 18.486 ms
(4 rows)

Are you referring actual time in above print?

The actual time is node execution time which in above kind of cases will
be: scanning the index + scanning the heap. I think it is not same what
you are planning to show for Insert/Update case.

>> Why not similar timings for heap?
>>
>
> well "actual time" shows us total time of the operation so just
> deducting the time spent on triggers, indexes and planning seems like
> a way to get "heap modification time".

planning time doesn't include parse time, so above calculation might
not give time spent in heap during statement execution.

>> Why can't we print when only Analyze is used with Explain, the
>> execution time is printed with Analyze option?
>
> i'm not sure the info is useful for everyone, i'm not opposed to show
> it all the time though

Okay, no problem. I think it can be done based on what most people
expect.

>> Could you please tell in what all kind of scenario's, do you expect it
>> to be useful?
>> One I could think is that if there are multiple indexes on a table and user
>> wants to find out if any particular index is consuming more time.
>>
>
> exactly my use case. consider this plan (we spent 78% of the time
> updating the index uniq_idx_on_text):

I think this will be useful for such cases.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-09 04:04:44
Message-ID: 1399608284088-5803337.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote
> On Thu, May 8, 2014 at 2:31 AM, Jaime Casanova &lt;

> jaime@

> &gt; wrote:
>> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila &lt;

> amit.kapila16@

> &gt; wrote:
>>> On Thu, May 8, 2014 at 5:30 AM, Jaime Casanova &lt;

> jaime@

> &gt; wrote:
>>
>> QUERY PLAN
>> --------------------------------------------------------------------------------------------
>> Insert on public.t1 (actual time=0.540..0.540 rows=0 loops=1)
>> -> Result (actual time=0.046..0.049 rows=1 loops=1)
>> Output:
> <some long data here>
>> Index uniq_idx_on_text on t1: time=0.421 rows=1
>> Index t1_pkey on t1: time=0.027 rows=1
>> Total runtime: 0.643 ms
>> (6 rows)
>
> I would have expected the information about index maintenance times to
> be associated with the Insert node, not the plan overall. IIUC, you
> could have more than one such node if, for example, there are
> writeable CTEs involved.

Even with multiple nodes I would think that typically each node would focus
on a different table and so listing everything, by table, at the bottom of
the explain would not cause that much of an issue. Even if a table gets hit
from two CTEs, and thus have their calls and times added together, the
relative (per call) numbers overall and comparative numbers between indexes
on the same table would remain constant. And if you are testing indexes you
would likely want to keep the queries the same.

Not having to scan through and handle multiple locations for index timings
will make interpretation, parsing, and presentation much easier. It
wouldn't preclude having total timings and node timings in the future if the
demand is sufficient.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/WIP-showing-index-maintenance-on-EXPLAIN-tp5803106p5803337.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-09 05:54:02
Message-ID: 536C6D7A.5020301@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 05/08/2014 08:04 AM, Jaime Casanova wrote:
> Maybe is not the right term... i mean: the time that take to update
> indexes on an INSERT/UPDATE operation

That'd be a big plus IMO, especially for expensive GiST or GIN index
updates.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Thom Brown <thom(at)linux(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-09 08:15:07
Message-ID: CAA-aLv579uDOhnC1477kYqAVd1jyP=6MBsG+jyXoReT6jkkV2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8 May 2014 01:00, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:

> Hi,
>
> This patch implements $subject only when ANALYZE and VERBOSE are on.
> I made it that way because for years nobody seemed interested in this
> info (at least no one did it) so i decided that maybe is to much
> information for most people (actually btree indexes are normally very
> fast).
>
> But because we have GiST and GIN this became an interested piece of
> data (there are other cases even when using btree).
>
> Current patch doesn't have docs yet i will add them soon.
>

+1

I've asked for something like this previously:
http://www.postgresql.org/message-id/CAA-aLv6O7QKdtZvacDJ+R_YjuLGCQTGzj6JXZTPmCnnxroX=7g@mail.gmail.com

--
Thom


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-05-09 18:03:35
Message-ID: CAJKUy5hmG-jgsb-iZ0fekn70siQdh+XRNkg4cXBB9yUb6wuhyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 8, 2014 at 10:44 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Thu, May 8, 2014 at 12:01 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> On Wed, May 7, 2014 at 10:52 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>>
>>> Why to capture only for Index Insert/Update and not for Read; is it
>>> because Read will be always fast ot implementation complexity?
>>>
>> EXPLAIN ANALYZE already shows that on any SELECT that uses an index in
>> some way. Or are you thinking on something else?
>
[...]
>
> Are you referring actual time in above print?
>
> The actual time is node execution time which in above kind of cases will
> be: scanning the index + scanning the heap. I think it is not same what
> you are planning to show for Insert/Update case.
>

ah! good point! my current case is because of write performance, but
will look at it a little

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] showing index maintenance on EXPLAIN
Date: 2014-06-20 14:13:01
Message-ID: CAJrrPGdMFRzgb_ZYDYzoR6Xdh8hNfEV0pVWj9W5GkPzuQkCC4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 8, 2014 at 10:00 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> This patch implements $subject only when ANALYZE and VERBOSE are on.
> I made it that way because for years nobody seemed interested in this
> info (at least no one did it) so i decided that maybe is to much
> information for most people (actually btree indexes are normally very
> fast).
>
> But because we have GiST and GIN this became an interested piece of
> data (there are other cases even when using btree).
>
> Current patch doesn't have docs yet i will add them soon.

This patch provides the Insertion time of an index operation.
This information is useful to the administrator for reorganizing the indexes
based on the insertion time.

Quick review:

Applies to Head.
Regress test is passed.
Coding is fine.

Minor comments:

There is no need of printing the index insertion time as zero in case
of hot update operations.
Please correct the same.

Add the documentation changes.

Regards,
Hari Babu
Fujitsu Australia