Implementation of GROUPING SETS (T431: Extended grouping capabilities)

From: Олег Царев <zabivator(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date: 2009-05-10 12:57:06
Message-ID: 54f48e4f0905100557r216dbd20j2ceb1e99d7c39004@mail.gmail.com
Lists: pgsql-hackers

Hello all.
Please review my ideas for the implementation.

The SQL standard has feature T431: Extended grouping capabilities.
I found this feature in the TODO list:
http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO

MS SQL 2005 partially supports this feature:
http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx

MS SQL 2008 supports this feature:
http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx

Oracle supports this feature:
http://www.compshack.com/sql/oracle-group-rollup

These are only short notes about GROUPING SETS; more complete
information is in the official documentation of MS SQL and Oracle
(copyright prevents me from sending it as an attachment).

First, GROUPING SETS.

select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A),
() ) - this is an example of using grouping sets.
The semantics of this construction: group the source by more than one
set of columns in a single query.
The widest key is A,B,C. The result set of this example contains the
result set of "select A,B,C,SUM(D) from table group by A,B,C" as a
subset. That is, "GROUP BY A,B,C" is a subset of "GROUP BY GROUPING
SETS( (A,B,C), (A), () )".
The second subset is GROUP BY A, where the B and C columns show NULL.
The third subset is GROUP BY (): A, B and C are all NULL and there is
one row, called the "GRAND TOTAL", calculated over the whole source
without grouping.
There is also the function GROUPING, which says whether a NULL is a
"real NULL" (from the table) or one generated by "GROUP BY GROUPING
SETS".
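
To make the semantics above concrete, here is a minimal sketch in Python
(not PostgreSQL code). The function name grouping_sets_sum, the dict-based
row format and the sample data are all hypothetical and chosen only for
illustration. It evaluates SUM over each grouping set in turn and adds one
flag per column that plays the role of GROUPING(): 1 when the NULL was
generated by the grouping set, 0 when the value came from the data.

```python
from collections import defaultdict


def grouping_sets_sum(rows, columns, sets, value):
    """Evaluate SUM(value) for each grouping set over a list of dict rows.

    Each output tuple is (key values..., grouping flags..., sum). A flag
    of 1 means the None in that column was generated by the grouping set;
    a flag of 0 means the value (possibly None) came from the data.
    """
    out = []
    for gset in sets:
        totals = defaultdict(int)
        for row in rows:
            key = tuple(row[c] for c in gset)
            totals[key] += row[value]
        for key, total in totals.items():
            vals = dict(zip(gset, key))
            keys = tuple(vals.get(c) for c in columns)
            flags = tuple(0 if c in gset else 1 for c in columns)
            out.append(keys + flags + (total,))
    return out


# Hypothetical sample data: the first row has a "real" NULL in column b.
rows = [
    {"a": 1, "b": None, "c": 10, "d": 5},
    {"a": 1, "b": 2, "c": 10, "d": 7},
]
result = grouping_sets_sum(
    rows, ("a", "b", "c"), [("a", "b", "c"), ("a",), ()], "d"
)
```

In this sketch the row for the real NULL carries the flags 0,0,0, while
the row produced by the (A) grouping set carries 0,1,1, which is the
distinction the GROUPING function is meant to expose.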

My point: this feature can be implemented on top of GROUP BY and
UNION ALL.
We can decompose "GROUP BY GROUPING SETS( (A,B,C),(A),() )" into
select A,B,C,SUM(D) from table group by A,B,C UNION ALL select
A,NULL,NULL,SUM(D) from table group by A UNION ALL select
NULL,NULL,NULL,SUM(D) from table group by ();
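
The decomposition can be tried out without any GROUPING SETS support at
all. The sketch below uses Python's sqlite3 module only as a convenient
way to run plain GROUP BY and UNION ALL; the table name t and the sample
rows are assumptions made for illustration. The grand total branch is
written as a plain aggregate with no GROUP BY clause, which I assume to be
equivalent to grouping by the empty set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 10), (1, 1, 2, 20), (1, 2, 1, 30), (2, 1, 1, 40)],
)

# One SELECT per grouping set: (a, b, c), then (a), then the empty set.
decomposed = """
    SELECT a, b, c, SUM(d) FROM t GROUP BY a, b, c
    UNION ALL
    SELECT a, NULL, NULL, SUM(d) FROM t GROUP BY a
    UNION ALL
    SELECT NULL, NULL, NULL, SUM(d) FROM t
"""
rows = conn.execute(decomposed).fetchall()
for row in rows:
    print(row)
```

Each branch reads the table again, so this form costs one scan per
grouping set.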

So it is very simple and does not require modifying the executor or
calibrating costs - only the parser and semantic analysis.

So, ROLLUP(A1,...,An) is an alias for "GROUP BY GROUPING SETS(
(A1,...,An),(A1,...,An-1),...,(A1,A2),(A1),() )".
CUBE is analogous: it expands to the grouping sets for all subsets of
its columns.
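
As a sketch of that rewriting step, the expansion of ROLLUP and CUBE into
explicit grouping sets can be written in a few lines of Python. The
function names rollup and cube are hypothetical helpers for illustration,
not part of any existing patch: ROLLUP yields every prefix of the column
list, and CUBE yields every subset.

```python
from itertools import combinations


def rollup(columns):
    """Return the grouping sets for ROLLUP: every prefix, longest first."""
    cols = tuple(columns)
    return [cols[:i] for i in range(len(cols), -1, -1)]


def cube(columns):
    """Return the grouping sets for CUBE: every subset, largest first."""
    cols = tuple(columns)
    return [
        combo
        for size in range(len(cols), -1, -1)
        for combo in combinations(cols, size)
    ]


print(rollup(("a", "b", "c")))  # n + 1 grouping sets
print(cube(("a", "b")))         # 2 ** n grouping sets
```

ROLLUP over n columns therefore produces n + 1 grouping sets, while CUBE
produces 2^n of them.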

If this idea is good, I can write the code based on the old patch
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or
from scratch (as you wish).

In the future, I know how to implement ROLLUP more optimally (as an
executor iterator) and how to use this ROLLUP to optimise other GROUP BY
and GROUPING SETS queries.

Thanks.
