Re: graph representation of data structures in optimizer

Lists: pgsql-hackers
From: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
To: pgsql-hackers(at)postgresql(dot)org
Subject: graph representation of data structures in optimizer
Date: 2009-02-18 15:22:02
Message-ID: 499C279A.3090602@c3sl.ufpr.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I'm interested in data representation and debug of optimizer routines.
Thus, I've changed the debug functions of allpaths.c to make a
graphviz-like output of RelOptInfo structure.

Any idea about this?
Is there some project or improvement like this?

Tanks,
Adriano Lange
C3SL/UFPR - www.c3sl.ufpr.br

Attachment Content-Type Size
RelOptInfo_graph.png image/png 150.3 KB

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-18 16:19:09
Message-ID: 87zlgjzpj6.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br> writes:

> Hi,
>
> I'm interested in data representation and debug of optimizer routines. Thus,
> I've changed the debug functions of allpaths.c to make a graphviz-like output
> of RelOptInfo structure.
>
> Any idea about this?
> Is there some project or improvement like this?

Several people have asked about ways to see what possible plans were
considered and why the were rejected, it was one of the repeat offenders in
the recent Postgres Pet Peeves thread so this is a very interesting area to
explore.

However I have to say this graph you've generated is amazingly hard to
decipher :) It took me a while to even figure out what information it was
presenting.

Worse, it's not useful unless you add a lot more information to it such as
what relations are actually being scanned or joined at each path which is
going to make it a hell of a lot harder to read.

I'm not sure how to do any better but I would be fascinated to see any new
images you generate :)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-18 17:21:30
Message-ID: 603c8f070902180921x2c64996dne234fc196ea098de@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 18, 2009 at 10:22 AM, Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br> wrote:
> Hi,
>
> I'm interested in data representation and debug of optimizer routines. Thus,
> I've changed the debug functions of allpaths.c to make a graphviz-like
> output of RelOptInfo structure.
>
> Any idea about this?
> Is there some project or improvement like this?

That is pretty cool.

It would help a lot to label the baserels with their names.

You might also want to move the RestrictInfo out of line so that it's
easier to see where the inner and outer joinpath arrows are going.

It would be really sweet if there were some compact way to see the pathkeys.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-18 18:01:15
Message-ID: 9471.1234980075@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br> writes:
>> I've changed the debug functions of allpaths.c to make a graphviz-like output
>> of RelOptInfo structure.

> However I have to say this graph you've generated is amazingly hard to
> decipher :) It took me a while to even figure out what information it was
> presenting.

> Worse, it's not useful unless you add a lot more information to it such as
> what relations are actually being scanned or joined at each path which is
> going to make it a hell of a lot harder to read.

Labeling the bottom-level scan paths with their relations would help a
lot. The label at the top level isn't real helpful.

But really I think the problem with this approach is that the
information density is too low --- imagine what it would look like in a
six-or-more-way join. I don't think the graphical approach is helpful
at all here.

Also, showing the final Path data structure has the problem that a lot
of the information someone might want is already gone, because we throw
away Paths that are determined to be dominated by other paths. The
question someone usually wants answered is "was a path of this structure
considered at all, and if so what was the estimated cost?". In a large
fraction of cases, that's not answerable from the paths that remain at
the end of the search. I think some sort of on-the-fly tracing of all
add_path calls might be a more useful approach.

regards, tom lane


From: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-19 00:55:54
Message-ID: 499CAE1A.4020504@c3sl.ufpr.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escreveu:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br> writes:
>>> I've changed the debug functions of allpaths.c to make a graphviz-like output
>>> of RelOptInfo structure.
>
>> However I have to say this graph you've generated is amazingly hard to
>> decipher :) It took me a while to even figure out what information it was
>> presenting.
>
>> Worse, it's not useful unless you add a lot more information to it such as
>> what relations are actually being scanned or joined at each path which is
>> going to make it a hell of a lot harder to read.
>
> Labeling the bottom-level scan paths with their relations would help a
> lot. The label at the top level isn't real helpful.
>
> But really I think the problem with this approach is that the
> information density is too low --- imagine what it would look like in a
> six-or-more-way join. I don't think the graphical approach is helpful
> at all here.

Certainly. That example had only three relations. Six relations in a
RelOptInfo will make a big graph and too hard to understand.

So, I will think about this for a while. A interesting thing for me is
to identify the in-degree pointers of each structure.

> Also, showing the final Path data structure has the problem that a lot
> of the information someone might want is already gone, because we throw
> away Paths that are determined to be dominated by other paths. The
> question someone usually wants answered is "was a path of this structure
> considered at all, and if so what was the estimated cost?". In a large
> fraction of cases, that's not answerable from the paths that remain at
> the end of the search. I think some sort of on-the-fly tracing of all
> add_path calls might be a more useful approach.
>
> regards, tom lane

Humm. This temporal approach may be dificult to represent in this
graphical mode. I guess that the text-like pretty_format_node_dump()
representation and diff are yet more usefull for this.

--
Adriano Lange
C3SL/UFPR - www.c3sl.ufpr.br


From: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-19 01:12:53
Message-ID: 499CB215.1060201@c3sl.ufpr.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas escreveu:
> That is pretty cool.
>
> It would help a lot to label the baserels with their names.
>
> You might also want to move the RestrictInfo out of line so that it's
> easier to see where the inner and outer joinpath arrows are going.

Humm. Maybe this is not easy to do in dot command line graph generator.
Perhaps I should to try this in other application. The output generated
by debug is only a text plain description of vertex and edges, without
any information about position or path. See the attached file.

> It would be really sweet if there were some compact way to see the pathkeys.

Several attributes and objects are missing yet, but I will add them.

Thanks,

Adriano Lange
C3SL/UFPR - www.c3sl.ufpr.br

Attachment Content-Type Size
RelOptInfo_graph.dot application/x-crossover-dot 8.6 KB

From: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-19 20:17:54
Message-ID: 499DBE72.1040709@c3sl.ufpr.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escreveu:
> But really I think the problem with this approach is that the
> information density is too low --- imagine what it would look like in a
> six-or-more-way join. I don't think the graphical approach is helpful
> at all here.

I was thinking about the hard visualization and navigability in the
graph. I think that a good solution for this would be a dynamic
navigability in their nodes and a rearrange their positions by focused
node. I'm not remembering now, but I've saw a tool like this.

Thanks

--
Adriano Lange
C3SL/UFPR - www.c3sl.ufpr.br


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-19 20:33:40
Message-ID: 200902192233.41370.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday 19 February 2009 22:17:54 Adriano Lange wrote:
> Tom Lane escreveu:
> > But really I think the problem with this approach is that the
> > information density is too low --- imagine what it would look like in a
> > six-or-more-way join. I don't think the graphical approach is helpful
> > at all here.
>
> I was thinking about the hard visualization and navigability in the
> graph. I think that a good solution for this would be a dynamic
> navigability in their nodes and a rearrange their positions by focused
> node. I'm not remembering now, but I've saw a tool like this.

http://prefuse.org could be useful for that.


From: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-22 23:04:56
Message-ID: 49A1DA18.5040803@c3sl.ufpr.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

The patch attached is an implementation for graph generation of data
structures in postgres. The file debuggraph.c is a simple library that
generate graphs in the format supported by graphviz. Example:

#include "nodes/debuggraph.h"
...
DebugGraph *graph;
DebugNode *node;

graph = createDebugGraph();
node = newDebugNode( graph, "node1", "Node 1" );
addDebugNodeAttribute( node, "Attribute 1", "Value 1" );
newDebugNode( graph, "node2", "Node 2" );
newDebugEdge( graph, "node1", "node2", "Edge 1" );

printGraphvizToFile( graph, stderr );
destroyDebugGraph( graph );

I've also made a reimplementation of outfuncs.c to outfuncs_graph.c for
an entire node output support.

#include "nodes/outfuncs_graph.h"
...
printGraphNodes( node, stderr );

I hope that this may be useful for somebody.

--
Adriano Lange
C3SL/UFPR - www.c3sl.ufpr.br

Attachment Content-Type Size
postgresql-8.3.5-debuggraph-20090222.patch text/x-diff 91.7 KB

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: graph representation of data structures in optimizer
Date: 2009-02-27 01:06:50
Message-ID: 20090227095936.AB47.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adriano Lange <adriano(at)c3sl(dot)ufpr(dot)br> wrote:

> The patch attached is an implementation for graph generation of data
> structures in postgres. The file debuggraph.c is a simple library that
> generate graphs in the format supported by graphviz.

It's interesting, but I don't think it is suitable for a core feature.
Could you split the patch into a core-patch and an extension module?
The extension module would be put in contrib or in a pgFoundry project.
XML might be good for communications between the core and the module;
XML-explain was ongoingly discussed, but had not been completed yet.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center