Lists: | pgsql-hackers |
---|
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE |
Date: | 2009-05-22 13:29:19 |
Message-ID: | 4A16A8AF.2080508@anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
When analyzing the plan of a query I often find myself questioning
whether an additional index may be sensible, or if it is sensible that a
SeqScan is used if an index is available.
The current EXPLAIN ANALYZE only shows the number of tuples matching the
qualifier of an SeqScan Node - for analyzing the above situation it is
at least equally interesting how many tuples were read and discarded.
Therefore I produced a patch which adds a 'discarded=%f' part to the
analyze output.
As this is only a RFD the implementation is a bit hackish at the moment
- the discarded counter is increased in execScan directly instead of a
helper routine in instrument.c.
Also the discarded count is displayed in other node types as well - for
some there might be a sensible semantic meaning to it...
Good idea - Bad idea?
Greetings,
Andres
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE - Patch v1 |
Date: | 2009-05-22 13:40:41 |
Message-ID: | 4A16AB59.9040101@anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
...
Attachment | Content-Type | Size |
---|---|---|
0001-Feature-discarded-tuple-count-display-in-EXPLAIN-ANA.patch | text/x-diff | 0 bytes |
From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE |
Date: | 2009-05-22 13:42:28 |
Message-ID: | 4A16ABC4.8050507@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Andres Freund wrote:
>
> When analyzing the plan of a query I often find myself questioning
> whether an additional index may be sensible, or if it is sensible that a
> SeqScan is used if an index is available.
>
> The current EXPLAIN ANALYZE only shows the number of tuples matching the
> qualifier of an SeqScan Node - for analyzing the above situation it is
> at least equally interesting how many tuples were read and discarded.
>
> Therefore I produced a patch which adds a 'discarded=%f' part to the
> analyze output.
> As this is only a RFD the implementation is a bit hackish at the moment
> - the discarded counter is increased in execScan directly instead of a
> helper routine in instrument.c.
> Also the discarded count is displayed in other node types as well - for
> some there might be a sensible semantic meaning to it...
>
> Good idea - Bad idea?
Isn't the discarded count always equal to (# of rows in table - matched
tuples)? Seems pretty redundant to me.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE |
Date: | 2009-05-22 13:48:21 |
Message-ID: | 4A16AD25.4090406@anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 05/22/2009 03:42 PM, Heikki Linnakangas wrote:
> Andres Freund wrote:
>> When analyzing the plan of a query I often find myself questioning
>> whether an additional index may be sensible, or if it is sensible that
>> a SeqScan is used if an index is available.
>>
>> The current EXPLAIN ANALYZE only shows the number of tuples matching
>> the qualifier of an SeqScan Node - for analyzing the above situation
>> it is at least equally interesting how many tuples were read and
>> discarded.
>> Good idea - Bad idea?
> Isn't the discarded count always equal to (# of rows in table - matched
> tuples)? Seems pretty redundant to me.
Not for EXISTS(), LIMIT and similar.
Also when looking at more complex plans its quite a nuisance to go
through all participating tables and do a separate count(*). Especially
its not your plan but some clients plan etc.
Andres
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE |
Date: | 2009-05-22 15:54:21 |
Message-ID: | 26176.1243007661@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 05/22/2009 03:42 PM, Heikki Linnakangas wrote:
>> Isn't the discarded count always equal to (# of rows in table - matched
>> tuples)? Seems pretty redundant to me.
> Not for EXISTS(), LIMIT and similar.
It doesn't really seem useful enough to justify breaking client-side
code that looks at EXPLAIN output.
This sort of ties into the discussions we have periodically about
allowing EXPLAIN to output XML or some other more-machine-friendly
data format. The barrier for adding additional output fields would
be a lot lower in such a format.
regards, tom lane
From: | Greg Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE |
Date: | 2009-05-22 16:11:46 |
Message-ID: | 4136ffa0905220911jec20a34ha1830e2dd67e8643@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, May 22, 2009 at 4:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> It doesn't really seem useful enough to justify breaking client-side
> code that looks at EXPLAIN output.
Fwiw at least pgadmin I don't think would be confused by this. These
tool authors aren't enamoured of fragile assumptions and the
maintenance headaches they cause either.
> This sort of ties into the discussions we have periodically about
> allowing EXPLAIN to output XML or some other more-machine-friendly
> data format. The barrier for adding additional output fields would
> be a lot lower in such a format.
This is still pretty much true if only for the sheer unscalability of
the amount of data being presented for users to sift through. I do
want us to add a ton more instrumentation into the explain plan and
this is only one small addition. If we add number of hard and soft
i/os, time spent in user and system space, etc the result would be
pretty unreadable and they're at least as important as things like
this.
--
greg
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE |
Date: | 2009-05-23 00:37:14 |
Message-ID: | 4A17453A.2000807@anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 05/22/2009 05:54 PM, Tom Lane wrote:
> This sort of ties into the discussions we have periodically about
> allowing EXPLAIN to output XML or some other more-machine-friendly
> data format. The barrier for adding additional output fields would
> be a lot lower in such a format.
So the best thing would be to work on that front...
Tom (Raney), did you further work on your XML explain patch? Could you
use help?
Greetings,
Andres