Plan targetlists in EXPLAIN output

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Plan targetlists in EXPLAIN output
Date: 2008-04-17 16:34:31
Message-ID: 21294.1208450071@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

For debugging the planner work I'm about to do, I'm expecting it will be
useful to be able to get EXPLAIN to print the targetlist of each plan
node, not just the quals (conditions) as it's historically done. My
first instinct is just to stick in the code under a debugging #ifdef,
but I wonder if anyone wants to argue for making it more easily
available?

I think it'd be a mistake to turn it on by default, because it'd add a
line for every plan node, which'd be an awful lot of bloat in output
that's hard enough to read already. And experience has shown that
99.99% of the time people don't need the info. Still, there's that
other 0.01%.

I'm tempted to propose redefining the currently-nearly-useless
EXPLAIN VERBOSE option as doing this.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 17:28:15
Message-ID: 874pa0crww.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> For debugging the planner work I'm about to do, I'm expecting it will be
> useful to be able to get EXPLAIN to print the targetlist of each plan
> node, not just the quals (conditions) as it's historically done. My
> first instinct is just to stick in the code under a debugging #ifdef,
> but I wonder if anyone wants to argue for making it more easily
> available?

Yes please.

> I think it'd be a mistake to turn it on by default, because it'd add a
> line for every plan node, which'd be an awful lot of bloat in output
> that's hard enough to read already. And experience has shown that
> 99.99% of the time people don't need the info. Still, there's that
> other 0.01%.
>
> I'm tempted to propose redefining the currently-nearly-useless
> EXPLAIN VERBOSE option as doing this.

EXPLAIN VERBOSE is indeed ridiculous. The only downside is that people
following modern instructions on old installs will be sad. But I'm fine with
that.

IMHO You could also move "width" to VERBOSE while you're at it. In fact you'll
probably want width in precisely the same cases where you want the target
list.

I think down the road we'll have a few different independent data sets you can
get out of explain or at least explain analyze. I want to get i/o stats in
there which I think you'll want to turn on and off as a group, for example.
But perhaps by the time we do that someone will have done XML explain and
it'll be irrelevant. I can't think of any nice syntax to do that offhand
anyways. So +1 for just redefining VERBOSE.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 17:45:39
Message-ID: 23151.1208454339@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> EXPLAIN VERBOSE is indeed ridiculous.

There are other ways to get that printout, too, if you really do need
it.

> IMHO You could also move "width" to VERBOSE while you're at it. In fact you'll
> probably want width in precisely the same cases where you want the target
> list.

I'm not convinced. The width is often useful to understand why the
planner did something (eg, chose a hash plan or not). The exact
contents of the targetlist are usually not nearly as interesting.

> So +1 for just redefining VERBOSE.

Barring other objections I'll go do that.

BTW, while testing the code I already found a bug:

regression=# set enable_hashagg to 0;
SET
regression=# explain select thousand from tenk1 group by 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Group (cost=1122.39..1172.39 rows=998 width=4)
Output: thousand
-> Sort (cost=1122.39..1147.39 rows=10000 width=4)
Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
Sort Key: thousand
-> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4)
Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
(7 rows)

Only the "thousand" column is needed, so why is it emitting all columns?
It's evidently allowing the "use physical tlist" optimization to fire,
which saves cycles inside the SeqScan node --- but in this context
that's penny-wise and pound-foolish, because we're pumping useless data
through the Sort. There is code in the planner that's supposed to
notice the needs of the next level up, but it's not getting this case
right for some reason...

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 18:09:04
Message-ID: 87ve2gbbgf.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> I'm not convinced. The width is often useful to understand why the
> planner did something (eg, chose a hash plan or not). The exact
> contents of the targetlist are usually not nearly as interesting.

I've never seen a single post on any of the lists where anyone went through
that exercise though.

> -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4)
> Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4

I wonder if I even understand what width means. Or does the planner think most
of these columns are mostly null?

Or is it estimating the width based on the belief that only the thousand
column is actually going to be emitted?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 18:22:59
Message-ID: 23762.1208456579@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> Or is it estimating the width based on the belief that only the thousand
> column is actually going to be emitted?

Right. The width is used to estimate how much space would be needed
for, eg, sorting or hashing the plan node's output. In any case where
something like that is actually happening, we *should* be emitting only
the required columns, so I didn't see any particular need to make
use_physical_tlist change the reported width. OTOH this bug shows that
maybe that was hiding useful information ...

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 18:42:49
Message-ID: 1208457769.4259.403.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2008-04-17 at 12:34 -0400, Tom Lane wrote:

> I'm tempted to propose redefining the currently-nearly-useless
> EXPLAIN VERBOSE option as doing this.

Yes please.

Sounds like a good home for other useful things also.

I'd like to have an EXPLAIN mode that displayed the plan without *any*
changeable info (i.e. no costs, row counts etc). This would then allow
more easy determination of whether plans had changed over time. (But
EXPLAIN TERSE sounds silly).

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 23:23:19
Message-ID: Pine.GSO.4.64.0804171919550.9789@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 17 Apr 2008, Tom Lane wrote:

> For debugging the planner work I'm about to do, I'm expecting it will be
> useful to be able to get EXPLAIN to print the targetlist of each plan
> node, not just the quals (conditions) as it's historically done.

I've heard that some of the academic users of PostgreSQL were hoping to
add features in this area in order to allow better using planner internals
for educational purposes. It would be nice if that were available for
such purposes without having to recompile.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: PFC <lists(at)peufeu(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-17 23:41:29
Message-ID: op.t9r33fjycigqcu@apollo13.peufeu.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 17 Apr 2008 20:42:49 +0200, Simon Riggs <simon(at)2ndquadrant(dot)com>
wrote:

> On Thu, 2008-04-17 at 12:34 -0400, Tom Lane wrote:
>
>> I'm tempted to propose redefining the currently-nearly-useless
>> EXPLAIN VERBOSE option as doing this.
>
> Yes please.
>
> Sounds like a good home for other useful things also.
>
> I'd like to have an EXPLAIN mode that displayed the plan without *any*
> changeable info (i.e. no costs, row counts etc). This would then allow
> more easy determination of whether plans had changed over time. (But
> EXPLAIN TERSE sounds silly).
>

Plan = Tree
Tree = XML

EXPLAIN ANALYZE SELECT * FROM test NATURAL JOIN test2 WHERE id
=ANY('{3,666,975,521'});
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=17.04..65.13 rows=1 width=8) (actual
time=51.835..51.835 rows=0 loops=1)
Join Filter: (test.value = test2.value)
-> Bitmap Heap Scan on test (cost=17.04..31.96 rows=4 width=8)
(actual time=16.622..16.631 rows=4 loops=1)
Recheck Cond: (id = ANY ('{3,666,975,521}'::integer[]))
-> Bitmap Index Scan on test_pkey (cost=0.00..17.04 rows=4
width=0) (actual time=16.613..16.613 rows=4 loops=1)
Index Cond: (id = ANY ('{3,666,975,521}'::integer[]))
-> Index Scan using test2_pkey on test2 (cost=0.00..8.28 rows=1
width=8) (actual time=8.794..8.795 rows=1 loops=4)
Index Cond: (test2.id = test.id)

EXPLAIN XML ...

<NestedLoop>
<Join Filter="(test.value = test2.value)">
<BitmapHeapScan Target="test" RecheckCond="(id) = ANY ($1)" />
<BitmapIndexScan Index="test_pkey" Cond="id = ANY ('$1'::integer[]))" />
</Join>
<IndexScan Index="test2_pkey" Target="test2" Cond="test2.id = test.id" />
</NestedLoop>

Nicely parsable and displayable in all its glory in pgadmin ;)


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: PFC <lists(at)peufeu(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-04-18 01:57:44
Message-ID: 20080418015744.GX3846@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

PFC wrote:

> Plan = Tree
> Tree = XML

If you want to propose a DTD I'm sure there would be many people
interested.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Raney <twraney(at)comcast(dot)net>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Plan targetlists in EXPLAIN output
Date: 2008-06-19 22:20:26
Message-ID: 485ADBAA.9080702@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have been working on a project (for GSOC) to retrieve
planner/optimizer details. As part of the project, I need machine
parsable output. So, I thought I would dust off a patch I found from
last year that Germán Caamaño submitted. I didn't see any further
activity there so I integrated it into 8.4 and added a DTD.

The output below is generated by using the added flag 'XML' to the
EXPLAIN command. The DTD probably wouldn't be needed for every output
instance and may need its own flag.

I am coming up to speed on the planner internals, but it seems like this
first EXPLAIN XML concept may have some use. Are there any strong
opinions about the XML hierarchy? Is it enough to simply wrap the text
output from EXPLAIN with XML tags?

-Tom Raney

QUERY PLAN
-------------------------------------------------------------------
<?xml version="1.0"?>

<!DOCTYPE explain
[
<!ELEMENT explain (plan+) >
<!ELEMENT plan (table?, cost, qualifier?) >
<!ELEMENT table EMPTY >
<!ELEMENT cost EMPTY >
<!ELEMENT qualifier EMPTY >
<!ATTLIST explain
version CDATA #REQUIRED >
<!ATTLIST plan
name CDATA #REQUIRED
level CDATA #REQUIRED >
<!ATTLIST cost
startup CDATA #REQUIRED
total CDATA #REQUIRED
rows CDATA #REQUIRED
width CDATA #REQUIRED >
<!ATTLIST table
name CDATA #REQUIRED >
<!ATTLIST qualifier
type CDATA #REQUIRED
value CDATA #REQUIRED >
]>

<explain version="8.4devel">
<plan name="Seq Scan" level="0">
<table name="tenk1"/>
<cost startup="0.00" total="445.00" rows="10000" width="244" />
</plan>
</explain>
(32 rows)

Greg Smith wrote:
> On Thu, 17 Apr 2008, Tom Lane wrote:
>
>> For debugging the planner work I'm about to do, I'm expecting it will be
>> useful to be able to get EXPLAIN to print the targetlist of each plan
>> node, not just the quals (conditions) as it's historically done.
>
> I've heard that some of the academic users of PostgreSQL were hoping
> to add features in this area in order to allow better using planner
> internals for educational purposes. It would be nice if that were
> available for such purposes without having to recompile.
>
> --
> * Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
>