REVIEW: EXPLAIN and nfiltered

Lists: pgsql-hackers
From: Stephen Frost <sfrost(at)snowman(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Subject: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 03:16:51
Message-ID: 20110120031651.GH30352@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greetings,

On 2010-01-15 11:37 PM +200, Marko Tiikkaja wrote:
> On 2010-11-18 5:45 PM +0200, Marko Tiikkaja wrote:
> > Here's a patch for showing in EXPLAIN ANALYZE the number of rows a plan
> > qual filtered from a node's input.
>
> Rebased against master.

This patch looked good, in general, to me. I added a few documentation
updates and a comment, but it's a very straight-forward patch as far as
I can tell. Passes all regressions and my additional testing.

commit fac899f7967ce74e14a90af9ca24e1a1f5a580e7
Author: Stephen Frost <sfrost(at)snowman(dot)net>
Date: Wed Jan 19 22:14:54 2011 -0500

Fix < & > in docs to be &lt; &gt;, as required.

commit 5fcdb75a646912b8b273703caf33dadb80122e1c
Author: Stephen Frost <sfrost(at)snowman(dot)net>
Date: Wed Jan 19 22:05:05 2011 -0500

Update documentation for EXPLAIN ANALYZE/nfiltered

This patch updates some documentation around EXPLAIN ANALYZE, whose
output has been changed by the patch which added nfiltered to it.

Also added a comment in the only place that seemed to need one.

commit 9ebb0108a217c2d3b7f815d1d902d6bdcc276104
Author: Stephen Frost <sfrost(at)snowman(dot)net>
Date: Wed Jan 19 21:33:28 2011 -0500

Add nfiltered in EXPLAIN ANALYZE

This patch add the number of rows a plan qual filtered from a node's
input to the EXPLAIN ANALYZE output.

Patch by: Marko Tiikkaja

Thanks,

Stephen

Attachment Content-Type Size
explain_nfiltered.patch text/x-diff 6.3 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 04:03:03
Message-ID: AANLkTinw9ZqtVBp8pRGR4rRFKQH72Lj8UycHh5mSxYLV@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 19, 2011 at 10:16 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> This patch looked good, in general, to me.  I added a few documentation
> updates and a comment, but it's a very straight-forward patch as far as
> I can tell.  Passes all regressions and my additional testing.

I have not looked at the code for this patch at all as yet, but just
as a general user comment - I really, really want this. It's one of
about, uh, two pieces of information that the EXPLAIN output doesn't
give you that is incredibly important for troubleshooting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 06:37:12
Message-ID: AANLkTinGC0TcgfMw7+kYWHZxEpfpAhxOpyd+mVf1Kwkf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 12:16, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> This patch looked good, in general, to me.  I added a few documentation
> updates and a comment, but it's a very straight-forward patch as far as
> I can tell.  Passes all regressions and my additional testing.

Looks good and useful for me, too.

We need to adjust a bit more documentation. The patch changes all of
EXPLAIN ANALYZE outputs. When I grep'ed the docs with "loops=",
EXPLAIN ANALYZE is also used in perform.sgml and auto-explain.sgml
in addition to explain.sgml.

It's good to have documentation about "nfiltered" parameter. The best
place would be around the descriptions of "loops" in "Using EXPLAIN" page:

http://developer.postgresql.org/pgdocs/postgres/using-explain.html

--
Itagaki Takahiro


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 16:10:39
Message-ID: 14894.1295539839@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Jan 19, 2011 at 10:16 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> This patch looked good, in general, to me. I added a few documentation
>> updates and a comment, but it's a very straight-forward patch as far as
>> I can tell. Passes all regressions and my additional testing.

> I have not looked at the code for this patch at all as yet, but just
> as a general user comment - I really, really want this. It's one of
> about, uh, two pieces of information that the EXPLAIN output doesn't
> give you that is incredibly important for troubleshooting.

What's the other one?

The main problem I've got with this patch is that there's no place to
shoehorn the information into the textual EXPLAIN format without
breaking a lot of expectations (and hence code --- it's insane to
imagine that any significant amount of client-side code has been
rewritten to make use of xml/json output yet). It would be nice to know
what other requests are likely to be coming down the pike before we
decide exactly how we're going to break things.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 16:20:12
Message-ID: AANLkTinNNo7tGf1LkCK9AecyfnuGbF=PMDoJLTEtXW9e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 11:10 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Jan 19, 2011 at 10:16 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>> This patch looked good, in general, to me.  I added a few documentation
>>> updates and a comment, but it's a very straight-forward patch as far as
>>> I can tell.  Passes all regressions and my additional testing.
>
>> I have not looked at the code for this patch at all as yet, but just
>> as a general user comment - I really, really want this.  It's one of
>> about, uh, two pieces of information that the EXPLAIN output doesn't
>> give you that is incredibly important for troubleshooting.
>
> What's the other one?

In the following sort of plan:

rhaas=# explain analyze select * from bob b, sally s where b.a = s.a;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..117890.00 rows=1000 width=8) (actual
time=0.036..533.372 rows=1000 loops=1)
-> Seq Scan on sally s (cost=0.00..5770.00 rows=400000 width=4)
(actual time=0.014..46.469 rows=400000 loops=1)
-> Index Scan using bob_pkey on bob b (cost=0.00..0.27 rows=1
width=4) (actual time=0.001..0.001 rows=0 loops=400000)
Index Cond: (a = s.a)
Total runtime: 533.935 ms
(5 rows)

...you cannot really tell how many rows the index scan was expected to
match, or actually did match. The answer to the latter question
certainly isn't 0. We previously discussed making the rows= line go
out to three decimal places when used in an inner-index-scan context,
which would help a lot - you could then multiply rows by loops to get
an approximate answer. My preferred fix would be just to remove the
unhelpful division-by-nloops code that gets applied in this case, but
that's less backward-compatible.

> The main problem I've got with this patch is that there's no place to
> shoehorn the information into the textual EXPLAIN format without
> breaking a lot of expectations (and hence code --- it's insane to
> imagine that any significant amount of client-side code has been
> rewritten to make use of xml/json output yet).  It would be nice to know
> what other requests are likely to be coming down the pike before we
> decide exactly how we're going to break things.

It's hard to predict the nature of future feature requests, but this
and the above are at the top of my list of ongoing gripes, and there
isn't a close third.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 16:55:33
Message-ID: 20110120165533.GM30352@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> The main problem I've got with this patch is that there's no place to
> shoehorn the information into the textual EXPLAIN format without
> breaking a lot of expectations (and hence code --- it's insane to
> imagine that any significant amount of client-side code has been
> rewritten to make use of xml/json output yet). It would be nice to know
> what other requests are likely to be coming down the pike before we
> decide exactly how we're going to break things.

While I agree completely about the general "if you're going to break,
break it big" approach, but I don't particularly care for holding output
strings from EXPLAIN to the same level that we do the wireline protocol.
This is going into a new major version, not something which is being
back-patched, and users now have a way in a released version to get away
from relying on the string output.

Have we worried about adding new plan nodes due to breakage in the
explain output..? It strikes me that we've actually changed it with
some regularity, in one aspect or another, over a couple of releases.
Maybe my memory is bad though.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 17:07:36
Message-ID: AANLkTimx=diaDvNHmMaRupQkxARezASYmSUJe1WLEfAS@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 11:55 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> The main problem I've got with this patch is that there's no place to
>> shoehorn the information into the textual EXPLAIN format without
>> breaking a lot of expectations (and hence code --- it's insane to
>> imagine that any significant amount of client-side code has been
>> rewritten to make use of xml/json output yet).  It would be nice to know
>> what other requests are likely to be coming down the pike before we
>> decide exactly how we're going to break things.
>
> While I agree completely about the general "if you're going to break,
> break it big" approach, but I don't particularly care for holding output
> strings from EXPLAIN to the same level that we do the wireline protocol.
> This is going into a new major version, not something which is being
> back-patched, and users now have a way in a released version to get away
> from relying on the string output.

I agree; we make bigger changes than this all the time. At the risk
of putting words in Tom's mouth, I think part of the concern here may
be that the EXPLAIN output is quite verbose already, and adding a few
more details is going to make it harder to read in the cases where you
*don't* care about this additional information. That's a valid
concern, but I don't know what to do about it - not having this
information available isn't better. It's tempting to propose moving
the "actual" numbers down to the next line, so that the lines aren't
so ridiculously long.

Looking at the patch, I have to say I had hoped this was going to show
nfiltered in both the estimated and actual cases, which it doesn't.
Now maybe that's more work than we want to put in, but it would be
nice to have.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 17:19:59
Message-ID: 4D386EBF.60509@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2011-01-20 7:07 PM +0200, Robert Haas wrote:
> Looking at the patch, I have to say I had hoped this was going to show
> nfiltered in both the estimated and actual cases, which it doesn't.
> Now maybe that's more work than we want to put in, but it would be
> nice to have.

That would be fantastical, if we can make it happen.

Regards,
Marko Tiikkaja


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, depesz(at)depesz(dot)com
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 17:41:26
Message-ID: 17628.1295545286@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jan 20, 2011 at 11:55 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> While I agree completely about the general "if you're going to break,
>> break it big" approach, but I don't particularly care for holding output
>> strings from EXPLAIN to the same level that we do the wireline protocol.

> I agree; we make bigger changes than this all the time.

No, we don't. It's true that a client that wants to truly *understand*
the plan has to know a lot of things, but the fundamental format of
EXPLAIN ANALYZE output has been real stable for a real long time:

node name (cost=xxx.xx..xxx.xx rows=xxx width=xxx) (actual time=xxx.xxx..xxx.xxx rows=xxx loops=xxx)
detail line: something or other
-> subnode name ... more of the same ...

This level of understanding seems plenty sufficient for something like
explain.depesz.com, to name just one popular tool. The last format
change of any kind we made in this skeleton was to increase the number
of decimal places in the "actual time" numbers from 2 to 3 (wow).
That was in 7.4. Modulo that detail, this basic contract has been valid
since EXPLAIN ANALYZE was invented, in 7.2. As proposed, this patch
will break it.

It might be interesting for somebody to go look at Hubert's code and see
just how much it really knows about the EXPLAIN output format, and how
much it's had to change across PG releases.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: depesz(at)depesz(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 17:57:38
Message-ID: AANLkTi=riCv8bO=N=gQgycUnsq+V--3NUFmnW9b9FxTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 20, 2011 6:43 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Thu, Jan 20, 2011 at 11:55 AM, Stephen Frost <sfrost(at)snowman(dot)net>
wrote:
> >> While I agree completely about the general "if you're going to break,
> >> break it big" approach, but I don't particularly care for holding
output
> >> strings from EXPLAIN to the same level that we do the wireline
protocol.
>
> > I agree; we make bigger changes than this all the time.
>
> No, we don't. It's true that a client that wants to truly *understand*
> the plan has to know a lot of things, but the fundamental format of
> EXPLAIN ANALYZE output has been real stable for a real long time:
>
> node name (cost=xxx.xx..xxx.xx rows=xxx width=xxx) (actual
time=xxx.xxx..xxx.xxx rows=xxx loops=xxx)
> detail line: something or other
> -> subnode name ... more of the same ...
>
> This level of understanding seems plenty sufficient for something like
> explain.depesz.com, to name just one popular tool. The last format
> change of any kind we made in this skeleton was to increase the number
> of decimal places in the "actual time" numbers from 2 to 3 (wow).
> That was in 7.4. Modulo that detail, this basic contract has been valid
> since EXPLAIN ANALYZE was invented, in 7.2. As proposed, this patch
> will break it.
>
> It might be interesting for somebody to go look at Hubert's code and see
> just how much it really knows about the EXPLAIN output format, and how
> much it's had to change across PG releases.
>

Haven't looked at what changes with this patch, but dont forget PgAdmin that
also parses the output. Though if the format changes enough to affect it,
that might be the driving force to have it use xml format instead, which is
the one that is intended for machine parsing after all..

/Magnus


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, depesz(at)depesz(dot)com, Stephen Frost <sfrost(at)snowman(dot)net>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 17:58:35
Message-ID: AANLkTinpxhYuUSVKPRLBd9viUomZ2edM5evDEmX5O3ZC@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 12:57 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
> On Jan 20, 2011 6:43 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> > On Thu, Jan 20, 2011 at 11:55 AM, Stephen Frost <sfrost(at)snowman(dot)net>
>> > wrote:
>> >> While I agree completely about the general "if you're going to break,
>> >> break it big" approach, but I don't particularly care for holding
>> >> output
>> >> strings from EXPLAIN to the same level that we do the wireline
>> >> protocol.
>>
>> > I agree; we make bigger changes than this all the time.
>>
>> No, we don't.  It's true that a client that wants to truly *understand*
>> the plan has to know a lot of things, but the fundamental format of
>> EXPLAIN ANALYZE output has been real stable for a real long time:
>>
>>  node name  (cost=xxx.xx..xxx.xx rows=xxx width=xxx) (actual
>> time=xxx.xxx..xxx.xxx rows=xxx loops=xxx)
>>   detail line: something or other
>>   ->  subnode name  ... more of the same ...
>>
>> This level of understanding seems plenty sufficient for something like
>> explain.depesz.com, to name just one popular tool.  The last format
>> change of any kind we made in this skeleton was to increase the number
>> of decimal places in the "actual time" numbers from 2 to 3 (wow).
>> That was in 7.4.  Modulo that detail, this basic contract has been valid
>> since EXPLAIN ANALYZE was invented, in 7.2.  As proposed, this patch
>> will break it.
>>
>> It might be interesting for somebody to go look at Hubert's code and see
>> just how much it really knows about the EXPLAIN output format, and how
>> much it's had to change across PG releases.
>>
>
> Haven't looked at what changes with this patch, but dont forget PgAdmin that
> also parses the output. Though if the format changes enough to affect it,
> that might be the driving force to have it use xml format instead, which is
> the one that is intended for machine parsing after all..

How much has that code been updated from one release to the next?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, depesz(at)depesz(dot)com
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 18:47:52
Message-ID: 20110120184752.GN30352@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I agree; we make bigger changes than this all the time.
>
> No, we don't.

Alright, do we want to go down the road of adding new things to the
XML/JSON/YAML/Whatever-else format that isn't displayed in the TEXT
version, to avoid this concern?

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, depesz(at)depesz(dot)com, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 19:48:59
Message-ID: 20110120194859.GO30352@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> How much has that code been updated from one release to the next?

Just an FYI, I talked to depesz on IRC (please chime in if you disagree
with any of this) and he indicated that he's had to update the code
from time to time, mostly because the parser was too strict.

He also mentioned that he didn't feel it was terribly complicated or
that it'd be difficult to update for this. Looking over the code, it's
got a simple regex for matching that line which would have to be
updated, but I don't think it'd require much more than that.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, depesz(at)depesz(dot)com
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 19:50:01
Message-ID: AANLkTimE7kvZWR2cYGSGUJibz4KXNaYRj_cMXBBoEhLp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 1:47 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> > I agree; we make bigger changes than this all the time.
>>
>> No, we don't.
>
> Alright, do we want to go down the road of adding new things to the
> XML/JSON/YAML/Whatever-else format that isn't displayed in the TEXT
> version, to avoid this concern?

No, because, for one thing, the text output is what people are going
to send me when they want me to fix their crap. If the information
isn't there, I lose. And no, I don't want them to send me the XML.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 19:53:06
Message-ID: 20110120195306.GA20864@depesz.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 02:48:59PM -0500, Stephen Frost wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > How much has that code been updated from one release to the next?
>
> Just an FYI, I talked to depesz on IRC (please chime in if you disagree
> with any of this) and he indicated that he's had to update the code
> from time to time, mostly because the parser was too strict.
>
> He also mentioned that he didn't feel it was terribly complicated or
> that it'd be difficult to update for this. Looking over the code, it's
> got a simple regex for matching that line which would have to be
> updated, but I don't think it'd require much more than that.

i'll be happy to update the Pg::Explain to handle new elements of
textual plans, so if this would be of concern - please don't treat
"compatibility with explain.depesz.com" as your responsibility/problem.

I'll fix the parser (have to add json/xml parsing too anyway), and I,
too, would love to get more information.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: depesz(at)depesz(dot)com / aim:depeszhdl / skype:depesz_hdl / gg:6749007


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: depesz(at)depesz(dot)com
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 20:47:52
Message-ID: 20606.1295556472@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

hubert depesz lubaczewski <depesz(at)depesz(dot)com> writes:
> On Thu, Jan 20, 2011 at 02:48:59PM -0500, Stephen Frost wrote:
>> He also mentioned that he didn't feel it was terribly complicated or
>> that it'd be difficult to update for this. Looking over the code, it's
>> got a simple regex for matching that line which would have to be
>> updated, but I don't think it'd require much more than that.

> i'll be happy to update the Pg::Explain to handle new elements of
> textual plans, so if this would be of concern - please don't treat
> "compatibility with explain.depesz.com" as your responsibility/problem.

The point isn't whether it'd be "terribly difficult" to update client
side EXPLAIN-parsing code ... it's whether we should break it in the
first place. I don't find the proposed format so remarkably
well-designed that it's worth creating compatibility problems for.

The main functional problem I see with this format is that it assumes
there is one and only one filter step associated with every plan node.
That is just plain wrong. Many don't have any, and there are important
cases where there are two. I'm thinking in particular that it might be
useful to distinguish the effects of the recheck and the filter
conditions of a bitmap heap scan. Maybe it'd also be interesting to
separate the join and non-join filter clauses of a join node, though
I'm less sure about the usefulness of that.

So the line I'm thinking we should pursue is to visually associate the
new counter with the filter condition, either like

Filter Cond: (x > 42) (nfiltered = 123)

or

Filter Cond: (x > 42)
Rows Filtered: 123

The latter is less ambiguous, but takes more vertical space. The former
is very unlikely to break any client code, because I doubt there is any
that inquires into the details of what a filter condition expression
really means. The latter *might* break code depending on how much
it assumes about the number of detail lines attached to a plan node
... but as Robert pointed out, we've added new detail lines before.

BTW, is it just me, or is the terminology "number filtered" pretty
confusing/ambiguous in itself? It doesn't seem at all clear to me
whether that's the number of rows passed by the filter condition or
the number of rows rejected. Perhaps "nremoved" would be clearer.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: depesz(at)depesz(dot)com, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 20:53:43
Message-ID: AANLkTi=b-qKY_rUO6DtwFVe+UZdY4WGrgJvOdws+d=Ni@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 3:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The main functional problem I see with this format is that it assumes
> there is one and only one filter step associated with every plan node.
> That is just plain wrong.  Many don't have any, and there are important
> cases where there are two.  I'm thinking in particular that it might be
> useful to distinguish the effects of the recheck and the filter
> conditions of a bitmap heap scan.

If it's not too hard to do that, I'm all in favor.

> Maybe it'd also be interesting to
> separate the join and non-join filter clauses of a join node, though
> I'm less sure about the usefulness of that.

That would also be extremely useful.

> So the line I'm thinking we should pursue is to visually associate the
> new counter with the filter condition, either like
>
>        Filter Cond: (x > 42)  (nfiltered = 123)
>
> or
>
>        Filter Cond: (x > 42)
>        Rows Filtered: 123
>
> The latter is less ambiguous, but takes more vertical space.  The former
> is very unlikely to break any client code, because I doubt there is any
> that inquires into the details of what a filter condition expression
> really means.  The latter *might* break code depending on how much
> it assumes about the number of detail lines attached to a plan node
> ... but as Robert pointed out, we've added new detail lines before.

I like the idea of putting it on the same line as the filter
condition, but your proposal for how to do that doesn't wow me - the
parentheses look too similar to the ones around the qual itself.

> BTW, is it just me, or is the terminology "number filtered" pretty
> confusing/ambiguous in itself?  It doesn't seem at all clear to me
> whether that's the number of rows passed by the filter condition or
> the number of rows rejected.  Perhaps "nremoved" would be clearer.

I think filtered is pretty clear and like it... removed sounds like
you deleted something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Marko Tiikkaja" <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, <depesz(at)depesz(dot)com>,"Magnus Hagander" <magnus(at)hagander(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Stephen Frost" <sfrost(at)snowman(dot)net>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 20:57:15
Message-ID: 4D384D4B0200002500039904@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I think filtered is pretty clear and like it...

I find it ambiguous. [Takes sip of filtered water.] How about
excluded?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: depesz(at)depesz(dot)com, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 21:32:28
Message-ID: 21412.1295559148@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jan 20, 2011 at 3:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> BTW, is it just me, or is the terminology "number filtered" pretty
>> confusing/ambiguous in itself? It doesn't seem at all clear to me
>> whether that's the number of rows passed by the filter condition or
>> the number of rows rejected. Perhaps "nremoved" would be clearer.

> I think filtered is pretty clear and like it... removed sounds like
> you deleted something.

Well, you did delete something, no? There are rows that aren't in the
output that would have been there if not for the filter condition.

And, btw, one person thinking it's clear doesn't make it so. There
are actually three numbers that could be involved here: the number of
rows arriving at the filter, the number passed by it, and the number
rejected by it. I think that "nfiltered" could possibly mean any of
those three. A non-native speaker of English would be even less
likely to be sure of what was meant.

regards, tom lane


From: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: depesz(at)depesz(dot)com, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-20 23:23:45
Message-ID: 4D38C401.4070808@cs.helsinki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/20/2011 12:47 PM, Tom Lane wrote:
> So the line I'm thinking we should pursue is to visually associate the
> new counter with the filter condition, either like
>
> Filter Cond: (x> 42) (nfiltered = 123)
>
> or
>
> Filter Cond: (x> 42)
> Rows Filtered: 123

I'd prefer the latter. Sometimes the Filter Cond is very complex and
finding the nfiltered information would be easier if it always had its
own row.

Regards,
Marko Tiikkaja


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, depesz(at)depesz(dot)com, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-hackers(at)postgresql(dot)org, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-21 02:06:34
Message-ID: AANLkTikS4tk4y8V36AiHnWFLKMp3ENY5n2BuaLZM3X=o@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 3:57 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> I think filtered is pretty clear and like it...
>
> I find it ambiguous.  [Takes sip of filtered water.]

Oh, you mean water that had some things you didn't want taken out of it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: depesz(at)depesz(dot)com, Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: REVIEW: EXPLAIN and nfiltered
Date: 2011-01-21 02:08:10
Message-ID: AANLkTimtLut7EXTpzWu2P1rmt2==LD0RG5_Cjqt6PErH@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 20, 2011 at 4:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Thu, Jan 20, 2011 at 3:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> BTW, is it just me, or is the terminology "number filtered" pretty
>>> confusing/ambiguous in itself?  It doesn't seem at all clear to me
>>> whether that's the number of rows passed by the filter condition or
>>> the number of rows rejected.  Perhaps "nremoved" would be clearer.
>
>> I think filtered is pretty clear and like it...  removed sounds like
>> you deleted something.
>
> Well, you did delete something, no?  There are rows that aren't in the
> output that would have been there if not for the filter condition.

What I mean to say is that I fear that removed would give the
impression that some modification had been made to the database.
Perhaps that's silly, but it's what came to mind.

> And, btw, one person thinking it's clear doesn't make it so.

That's why I said "I think" rather than "Any fool should be able to see that".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company