Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Joshua Tolley <eggyknap(at)gmail(dot)com>
Cc: Олег Царев <zabivator(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date: 2009-05-13 04:29:41
Message-ID: 162867790905122129y1843812ayaf0fca009f80dfce@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2009/5/13 Joshua Tolley <eggyknap(at)gmail(dot)com>:
> On Tue, May 12, 2009 at 11:20:14PM +0200, Pavel Stehule wrote:
>> this patch has some bugs but it is good prototype (it's more stable
>> than old patch):
>
> I'm not sure if you're at the point that you're interested in bug reports, but
> here's something that didn't behave as expected:
>
> 5432 josh(at)josh*# create table gsettest (prod_id integer, cust_id integer,
> quantity integer);
> CREATE TABLE
> 5432 josh(at)josh*# insert into gsettest select floor(random() * 10)::int,
> floor(random() * 20)::int, floor(random() * 10)::int from generate_series(1,
> 100);
> INSERT 0 100
> 5432 josh(at)josh*# select prod_id, cust_id, sum(quantity) from gsettest group by
> cube (prod_id, cust_id) order by 1, 2;
>  prod_id | cust_id | sum
> ---------+---------+-----
>       5 |       7 |   4
>       8 |      16 |   3
>       9 |      19 |   8
>       4 |      13 |   3
>       8 |       8 |  15
>       5 |       2 |   4
>       7 |       6 |   7
>       6 |       6 |   3
> </snip>
>
> Note that the results aren't sorted. The following, though, works around it:

I thing, so result should not be sorted - it's same like normal group by.

regards
Pavel Stehule

>
> 5432 josh(at)josh*# select * from (select prod_id, cust_id, sum(quantity) from
> gsettest group by cube (prod_id, cust_id)) f order by 1, 2;
>  prod_id | cust_id | sum
> ---------+---------+-----
>       0 |       2 |   8
>       0 |       4 |   8
>       0 |       5 |   2
>       0 |       7 |  11
>       0 |       8 |   7
>       0 |       9 |   1
>       0 |      12 |   3
>       0 |      14 |   7
>       0 |      16 |   5
>       0 |      17 |   8
>       0 |      18 |   9
>       0 |      19 |   2
>       0 |         |  71
> </snip>
>
> EXPLAIN output is as follows:
> 5432 josh(at)josh*# explain select prod_id, cust_id, sum(quantity) from gsettest
> group by cube (prod_id, cust_id) order by 1, 2;
>                                QUERY PLAN
> ---------------------------------------------------------------------------
>  Append  (cost=193.54..347.71 rows=601 width=9)
>   CTE **g**
>     ->  Sort  (cost=135.34..140.19 rows=1940 width=12)
>           Sort Key: gsettest.prod_id, gsettest.cust_id
>           ->  Seq Scan on gsettest  (cost=0.00..29.40 rows=1940 width=12)
>   ->  HashAggregate  (cost=53.35..55.85 rows=200 width=12)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=12)
>   ->  HashAggregate  (cost=48.50..51.00 rows=200 width=8)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=8)
>   ->  HashAggregate  (cost=48.50..51.00 rows=200 width=8)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=8)
>   ->  Aggregate  (cost=43.65..43.66 rows=1 width=4)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=4)
> (13 rows)
>
> ...and without the ORDER BY clause just to prove that it really is the reason
> for the Sort step...
>
> 5432 josh(at)josh*# explain select prod_id, cust_id, sum(quantity) from gsettest
> group by cube (prod_id, cust_id);
>                               QUERY PLAN
> ------------------------------------------------------------------------
>  Append  (cost=82.75..236.92 rows=601 width=9)
>   CTE **g**
>     ->  Seq Scan on gsettest  (cost=0.00..29.40 rows=1940 width=12)
>   ->  HashAggregate  (cost=53.35..55.85 rows=200 width=12)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=12)
>   ->  HashAggregate  (cost=48.50..51.00 rows=200 width=8)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=8)
>   ->  HashAggregate  (cost=48.50..51.00 rows=200 width=8)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=8)
>   ->  Aggregate  (cost=43.65..43.66 rows=1 width=4)
>         ->  CTE Scan on "**g**"  (cost=0.00..38.80 rows=1940 width=4)
> (11 rows)
>
> I'm hoping I'll get a chance to poke at the patch some. This could be very
> useful...
>
> - Josh / eggyknap
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
>
> iEYEARECAAYFAkoKOdUACgkQRiRfCGf1UMOpFQCeJGQftMheSi6blMwheK4HI89p
> E7cAnjdWi4FaerR/+RTBeSv9Zc0RRXQ3
> =xW04
> -----END PGP SIGNATURE-----
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Asko Oja 2009-05-13 04:56:24 Re: display previous query string of idle-in-transaction
Previous Message Joshua Tolley 2009-05-13 03:09:09 Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)