Re: WIP Patch for GROUPING SETS phase 1

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Subject: Re: WIP Patch for GROUPING SETS phase 1
Date: 2014-08-21 10:28:36
Message-ID: 87mwayqjv6.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


A progress update:

Atri> We envisage that handling of arbitrary grouping sets will be
Atri> best done by having the planner generating an Append of
Atri> multiple aggregation paths, presumably with some way of moving
Atri> the original input path to a CTE. We have not really explored
Atri> yet how hard this will be; suggestions are welcome.

This idea was abandoned.

Instead, we have implemented full support for arbitrary grouping sets
by means of a chaining system:

explain (verbose, costs off) select four, ten, hundred, count(*) from onek group by cube(four,ten,hundred);

QUERY PLAN
-----------------------------------------------------------------------------------------------------
GroupAggregate
Output: four, ten, hundred, count(*)
Grouping Sets: (onek.hundred, onek.four, onek.ten), (onek.hundred, onek.four), (onek.hundred), ()
-> Sort
Output: four, ten, hundred
Sort Key: onek.hundred, onek.four, onek.ten
-> ChainAggregate
Output: four, ten, hundred
Grouping Sets: (onek.ten, onek.hundred), (onek.ten)
-> Sort
Output: four, ten, hundred
Sort Key: onek.ten, onek.hundred
-> ChainAggregate
Output: four, ten, hundred
Grouping Sets: (onek.four, onek.ten), (onek.four)
-> Sort
Output: four, ten, hundred
Sort Key: onek.four, onek.ten
-> Seq Scan on public.onek
Output: four, ten, hundred
(20 rows)

The ChainAggregate nodes use a tuplestore to communicate with the
GroupAggregate node at the top of the chain; they pass through input
tuples unchanged, and write aggregated result rows to the tuplestore,
which the top node then returns once it has finished its own result.

The organization of the planner code seems to be actively hostile to
any attempt to break out new CTEs on the fly, or to plan parts of the
query more than once; the method above seems to be the easiest way to
avoid those issues.

Atri> At this point we are more interested in design review rather
Atri> than necessarily committing this patch in its current state.

This no longer applies; we expect to post within a day or two an
updated patch with full functionality.

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2014-08-21 10:35:41 Re: documentation update for doc/src/sgml/func.sgml
Previous Message Andreas 'ads' Scherbaum 2014-08-21 10:23:56 Re: documentation update for doc/src/sgml/func.sgml