Re: Ordering in an aggregate -- points to paths

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Julian Scarfe" <julian(dot)scarfe(at)ntlworld(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Ordering in an aggregate -- points to paths
Date: 2003-06-15 15:49:57
Message-ID: 13056.1055692197@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

"Julian Scarfe" <julian(dot)scarfe(at)ntlworld(dot)com> writes:
> OK, I know relying on ordering in an aggregate is sinful, but I don't know
> if it's mortal or venial.
> ...
> SELECT c.fir_ident, c.fir_indicator, create_path (c.node) AS fir_edge
> INTO fir_e
> FROM
> (SELECT fir_ident, fir_indicator, node
> FROM fir_coords
> ORDER BY fir_ident, fir_indicator,seq_no) c
> GROUP BY fir_ident, fir_indicator;

Yeah, this is a fairly obvious thing to want to do with a user-written
aggregate. It does not work in released versions, because the planner
does not notice that the inner SELECT's output ordering matches what
the GROUP BY needs, and so it inserts an additional Sort plan step
above the sub-select (you can see this if you look at EXPLAIN output).
Unfortunately, on most platforms qsort() isn't stable and will not
preserve the ordering of its input for equal keys. So you lose the
minor ordering by seq_no in the re-sort.

We have fixed this in CVS tip by teaching the planner to notice the
subselect's result ordering and avoid the redundant Sort step. The
patch is probably too large to consider back-patching into 7.3,
unfortunately. Here's the log entry if you want to pursue that:

2003-02-15 15:12 tgl

* src/: backend/optimizer/path/allpaths.c,
backend/optimizer/path/pathkeys.c,
backend/optimizer/plan/planner.c,
backend/optimizer/util/pathnode.c,
backend/optimizer/util/relnode.c, backend/optimizer/util/tlist.c,
include/optimizer/pathnode.h, include/optimizer/paths.h,
include/optimizer/tlist.h: Teach planner how to propagate pathkeys
from sub-SELECTs in FROM up to the outer query. (The
implementation is a bit klugy, but it would take nontrivial
restructuring to make it nicer, which this is probably not worth.)
This avoids unnecessary sort steps in examples like SELECT
foo,count(*) FROM (SELECT ... ORDER BY foo,bar) sub GROUP BY foo
which means there is now a reasonable technique for controlling the
order of inputs to custom aggregates, even in the grouping case.

regards, tom lane

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Julian Scarfe 2003-06-15 16:48:40 Re: Ordering in an aggregate -- points to paths
Previous Message Julian Scarfe 2003-06-15 15:13:14 Ordering in an aggregate -- points to paths