Re: Explain XML patch v2

Lists: pgsql-hackerspgsql-patches
From: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>
To: pgsql-patches(at)postgresql(dot)org
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Explain XML patch v2
Date: 2008-07-02 04:48:25
Message-ID: 20080701214825.m2eczeym8404ss4g@webmail.cecs.pdx.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

This is an update to my EXPLAIN XML patch submitted a few days ago.

I've added a documentation patch and modified some of the code per
comments by Gregory Stark.

Because the main consumer of output generated by this patch will
presumably be a machine, I didn't clutter up the documentation with
matching XML output for every standard example query. But, I added
enough to hopefully give the user an idea of what to expect.

Regards,

Tom Raney

Attachment Content-Type Size
exml_patch_v2 text/plain 41.8 KB

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Raney" <raneyt(at)cecs(dot)pdx(dot)edu>
Cc: <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Explain XML patch v2
Date: 2008-07-02 09:25:51
Message-ID: 87fxqsr61s.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Tom Raney" <raneyt(at)cecs(dot)pdx(dot)edu> writes:

> This is an update to my EXPLAIN XML patch submitted a few days ago.
>
> I've added a documentation patch and modified some of the code per comments by
> Gregory Stark.

You should update the wiki at

http://wiki.postgresql.org/wiki/CommitFest:2008-07

so any reviewers look at the updated patch.

(I certainly don't see any reason to wait two months for the next commit fest)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-patches(at)postgresql(dot)org
Cc: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-02 15:57:29
Message-ID: 200807021757.30278.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Am Mittwoch, 2. Juli 2008 schrieb Tom Raney:
> This is an update to my EXPLAIN XML patch submitted a few days ago.

Could you explain how you came up with the XML schema design? I suppose you
just made something up that went along with the existing XML output.

I would like to see more integration with the spirit of the existing XML
functionality. For example, instead of things like

"<runtime ms=\"%.3f\" />\n"

we ought to be using XML Schema data types for time intervals and so on.

We might also want to use an XML namespace.

Table and index names should be escaped using the existing escape mechanism
for identifiers. There might also be encoding issues.

It would also be interesting if EXPLAIN could optionally be a function that
returns a datum of type XML, to allow further processing.

Any thoughts on these issues?


From: David Fetter <david(at)fetter(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-02 16:01:18
Message-ID: 20080702160118.GD30328@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Jul 02, 2008 at 05:57:29PM +0200, Peter Eisentraut wrote:
> It would also be interesting if EXPLAIN could optionally be a
> function that returns a datum of type XML, to allow further
> processing.

It would be better to have a function which allows people to plug in
their own serialization. A JSON or YAML one, for example, would be
much lighter weight on both ends.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, "Tom Raney" <raneyt(at)cecs(dot)pdx(dot)edu>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-02 16:17:28
Message-ID: 937d27e10807020917n785ba632wf57439fbfa62e5c7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Jul 2, 2008 at 4:57 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Am Mittwoch, 2. Juli 2008 schrieb Tom Raney:
>> This is an update to my EXPLAIN XML patch submitted a few days ago.
>
> Could you explain how you came up with the XML schema design? I suppose you
> just made something up that went along with the existing XML output.

Speaking of schema - I haven't had time to review the patch myself
yet, but does it include schema names for all relations? The current
text output does not (and adding it has been rejected due to
verbosity), but that makes any kind of query tuning or information
gathering tool more or less impossible to write.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: daveg <daveg(at)sonic(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-patches(at)postgresql(dot)org, Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-02 19:39:32
Message-ID: 20080702193932.GO866@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Jul 02, 2008 at 09:01:18AM -0700, David Fetter wrote:
> On Wed, Jul 02, 2008 at 05:57:29PM +0200, Peter Eisentraut wrote:
> > It would also be interesting if EXPLAIN could optionally be a
> > function that returns a datum of type XML, to allow further
> > processing.
>
> It would be better to have a function which allows people to plug in
> their own serialization. A JSON or YAML one, for example, would be
> much lighter weight on both ends.

+1 for either of these.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.


From: Tom Raney <raneyt(at)cs(dot)pdx(dot)edu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-02 20:05:38
Message-ID: 486BDF92.8090700@cs.pdx.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut wrote:
> Am Mittwoch, 2. Juli 2008 schrieb Tom Raney:
>
>> This is an update to my EXPLAIN XML patch submitted a few days ago.
>>
>
> Could you explain how you came up with the XML schema design? I suppose you
> just made something up that went along with the existing XML output.
>
Yes, it is based on the existing output.
> I would like to see more integration with the spirit of the existing XML
> functionality. For example, instead of things like
>
> "<runtime ms=\"%.3f\" />\n"
>
> we ought to be using XML Schema data types for time intervals and so on.
>
The DTD provides only rudimentary document validation but it has no
support for type checking. So, it may make sense to move to the more
rigorous XML Schema. There is a 'duration' data type that could be used
for the instance listed above. Or, we could define our own.
> We might also want to use an XML namespace.
>
Taking the 'ms' field listed above:
xmlns="http://www.postgresql.org/v8.4/ms" or something like this?
> Table and index names should be escaped using the existing escape mechanism
> for identifiers. There might also be encoding issues.
>
That's a good point. Or, wrap them with CDATA.
> It would also be interesting if EXPLAIN could optionally be a function that
> returns a datum of type XML, to allow further processing.
>
> Any thoughts on these issues?
>
I am in the process of writing a parser of this XML output for the Red
Hat Visual Explain tool. I want to see what surprises come up during
implementation.


From: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>
To: daveg <daveg(at)sonic(dot)net>
Cc: David Fetter <david(at)fetter(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] Explain XML patch v2
Date: 2008-07-04 07:07:49
Message-ID: 20080704000749.motuh47huog884cs@webmail.cecs.pdx.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Quoting daveg <daveg(at)sonic(dot)net>:

> On Wed, Jul 02, 2008 at 09:01:18AM -0700, David Fetter wrote:
>> On Wed, Jul 02, 2008 at 05:57:29PM +0200, Peter Eisentraut wrote:
>> > It would also be interesting if EXPLAIN could optionally be a
>> > function that returns a datum of type XML, to allow further
>> > processing.
>>
>> It would be better to have a function which allows people to plug in
>> their own serialization. A JSON or YAML one, for example, would be
>> much lighter weight on both ends.
>
> +1 for either of these.
>
> -dg
>

So, this leads me to the idea of assembling the EXPLAIN data
internally in an output-neutral data structure. At the very end of
processing, one decision statement would decide which output plugin to
use for output. Sprinkling XML print statements throughout the code
(as currently done in the patch) while functional, is not ideal. And,
the escaping of XML content should ideally be done in the serializer
anyway.

Of course, this realization didn't occur to me until *after* I had
spent a bit of time coding up the patch in its current form. Oh well.

Thoughts?

Regarding the XML datum, in order to support that, will all users need
to compile with libxml? Are there any lighter weight solutions to
serialize other than libxml?

-Tom Raney


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-patches(at)postgresql(dot)org
Cc: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, daveg <daveg(at)sonic(dot)net>, David Fetter <david(at)fetter(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] Explain XML patch v2
Date: 2008-07-04 08:51:55
Message-ID: 200807041051.56610.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Am Freitag, 4. Juli 2008 schrieb Tom Raney:
> Regarding the XML datum, in order to support that, will all users need  
> to compile with libxml?  Are there any lighter weight solutions to  
> serialize other than libxml?

You can create XML without libxml.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, daveg <daveg(at)sonic(dot)net>, David Fetter <david(at)fetter(dot)org>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] Explain XML patch v2
Date: 2008-07-04 16:01:12
Message-ID: 23468.1215187272@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Am Freitag, 4. Juli 2008 schrieb Tom Raney:
>> Regarding the XML datum, in order to support that, will all users need
>> to compile with libxml? Are there any lighter weight solutions to
>> serialize other than libxml?

> You can create XML without libxml.

Seems to me that anyone who wants this feature will probably also want
the existing libxml-based features, so they'll be building with libxml
anyway. So I'd not be in favor of expending any extra code on a
roll-your-own solution.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>
Cc: pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-04 16:20:48
Message-ID: 1215188448.4051.225.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-01 at 21:48 -0700, Tom Raney wrote:
> This is an update to my EXPLAIN XML patch submitted a few days ago.

This is an important patch and you've done well to get it this far.

I have much detailed feedback, but these are just emergent requests, now
I can see and understand what you've done.

* If the patch was implemented as an ExplainOneQuery_hook we would be
able to use this with 8.3 also, which might be interesting. So there's
no real need for this to be a patch on core.

* If I understand, the DTD option inlines the DTD into the resulting XML
document, rather than adding a schema definition?

* The DTD should be in a separate file also, so it can be referred to.
We should give that DTD a permanent URL so we can identify it across the
internet. This is the first time the project has published an XML
DTD/Schema, so we need to think through the right way to publish it. The
DTD should have a version number in it, so when this gets updated in the
future we can validate against the correct one.

* The DTD is regrettably a long, long way from where I need it to be.
The PLAN elements are all unrelated to one another, apart from their
sequence within the XML document and their "indent". That definition can
lead to XML documents that match the DTD yet would never be valid plans,
amongst other problems. For sensible analysis of the SQL we need the DTD
to match the actual structure of nodes in the executor. This requires
major DTD changes, regrettably. The "indent" comes from the nesting of
the nodes, and is not merely a visual feature of the complete plan. IMHO
it is critically important that the XML correctly conveys the structure
of the plan and not just the formatting of the current text output.
Otherwise many doors will be closed to us and this whole project wasted.
I want to be able to write xml queries to retrieve plans that contain a
merge join where both the inner and outer are merge joins, or to easily
compare the structure of two complex queries looking for differences.

* The meaning of the two time attributes is somewhat confused. It is
startuptime and totaltime, not time_start and time_end.

* It looks to me like QUALIFIER alone is not enough, since in some cases
nodes have both an index condition and a filter, for example.

* I've made a number of suggested changes below, but the DTD really
needs to follow the structures in src/include/nodes/plannodes.h
In particular you'll see the recursive structure introduced by the DTD
changes below

* EXPLAIN has been renamed PLAN
* PLAN has been renamed NODE
* COST has been renamed to ESTIMATE
* ANALYZE has been renamed to ACTUAL
* OUTPUT, SORT and QUALIFIER have been removed for clarity only

<!ELEMENT plan (estimate, runtime?)>

<!ELEMENT node (table?, index?, estimate, actual?, outerpath?, innerpath?, initpath)>
<!ATTLIST plan\n"
nodetype CDATA #REQUIRED>

<!ELEMENT outerpath (node)>
<!ELEMENT innerpath (node)>
<!ELEMENT initpath (node)>

<!ELEMENT estimate EMPTY >

<!ATTLIST estimate
startupcost CDATA #REQUIRED
totalcost CDATA #REQUIRED
rows CDATA #REQUIRED
width CDATA #REQUIRED>

<!ELEMENT actual EMPTY >
<!ATTLIST actual\n"
startuptime CDATA #REQUIRED
totaltime CDATA #REQUIRED
rows CDATA #REQUIRED
loops CDATA #REQUIRED>

* I'd much prefer to define this as a Schema, since that's a bit more
flexible in what we can do, plus we can specify datatypes.

* Based upon some of those issues, I'd suggest that further work on the
DTD/Schema should only happen when a reasonable range of plans have been
accumulated to allow full understanding of the possibilities

I'm sorry I've found so much to say about this, but don't be dissuaded.
The basic structure of the patch is there, we just need to get the
details right also, so we can take full.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Raney" <raneyt(at)cecs(dot)pdx(dot)edu>, pgsql-patches(at)postgresql(dot)org, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-04 18:38:13
Message-ID: 937d27e10807041138u3b69dd27ra9d725d231f0c400@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, Jul 4, 2008 at 5:20 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> * If the patch was implemented as an ExplainOneQuery_hook we would be
> able to use this with 8.3 also, which might be interesting. So there's
> no real need for this to be a patch on core.

Would that mean that instead of being an optional format that pgAdmin
could request with appropriate grammar, it would be something that was
only available if the plugin was loaded, and would replace regular
EXPLAIN output? If so, that would be extremely inconvenient, and next
to useless for general purpose utilities.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-05 09:41:18
Message-ID: 1215250878.4051.268.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Fri, 2008-07-04 at 19:38 +0100, Dave Page wrote:
> On Fri, Jul 4, 2008 at 5:20 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > * If the patch was implemented as an ExplainOneQuery_hook we would be
> > able to use this with 8.3 also, which might be interesting. So there's
> > no real need for this to be a patch on core.
>
> Would that mean that instead of being an optional format that pgAdmin
> could request with appropriate grammar, it would be something that was
> only available if the plugin was loaded, and would replace regular
> EXPLAIN output? If so, that would be extremely inconvenient, and next
> to useless for general purpose utilities.

It can be optional since plugins can add parameters also.

It wouldn't take long to make up a plugin for 8.3 once this patch has
been committed to core for 8.4, so if you're saying you'd definitely
like it in core then I'm OK with that.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>
Cc: pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-05 09:50:35
Message-ID: 1215251435.4051.274.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Fri, 2008-07-04 at 17:20 +0100, Simon Riggs wrote:
> <!ELEMENT plan (estimate, runtime?)>

Sorry, I noticed a few typos in my sample DTD.

<!ELEMENT plan (node+)> with some additional elements for stats

and the attlist below was for the <node> not <plan> element

<!ATTLIST node
nodetype CDATA #REQUIRED>

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Raney" <raneyt(at)cecs(dot)pdx(dot)edu>, pgsql-patches(at)postgresql(dot)org, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-05 15:00:41
Message-ID: 937d27e10807050800q6416aefbx6699959f3f56e9f0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Sat, Jul 5, 2008 at 10:41 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> It can be optional since plugins can add parameters also.

GUCs I assume you mean, not grammar. Unless I'm misreading the code
though, if the plugin is there it will always run instead of the
regular explain code, so presumabiy that's optional as in XML or
nothing, not XML or standard output.

> It wouldn't take long to make up a plugin for 8.3 once this patch has
> been committed to core for 8.4, so if you're saying you'd definitely
> like it in core then I'm OK with that.

If i's always there it's definitely more useful to pgAdmin, and
doesn't require that we instruct users to install more server side
plugin code to use the features they want.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>, pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-05 15:46:29
Message-ID: 1215272789.4051.304.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


On Sat, 2008-07-05 at 16:00 +0100, Dave Page wrote:
> On Sat, Jul 5, 2008 at 10:41 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > It can be optional since plugins can add parameters also.
>
> GUCs I assume you mean, not grammar. Unless I'm misreading the code
> though, if the plugin is there it will always run instead of the
> regular explain code, so presumabiy that's optional as in XML or
> nothing, not XML or standard output.

You can easily make it switchable between XML and normal.

> > It wouldn't take long to make up a plugin for 8.3 once this patch has
> > been committed to core for 8.4, so if you're saying you'd definitely
> > like it in core then I'm OK with that.
>
> If i's always there it's definitely more useful to pgAdmin, and
> doesn't require that we instruct users to install more server side
> plugin code to use the features they want.

As I said, I'm OK with that.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support