Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

From: David Rowley <dgrowley(at)gmail(dot)com>
To: Thomas Mayer <thomas(dot)mayer(at)student(dot)kit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
Date: 2014-04-14 11:03:31
Message-ID: CAHoyFK9c8m16opDnAKnKHU6ahdDcXsLefjYVCb_wW7ksbO8rww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 14 April 2014 02:50, Thomas Mayer <thomas(dot)mayer(at)student(dot)kit(dot)edu> wrote:

> Hello David,
>
> thanks for your work. The results look promising.
>

Thanks

>
> What I'm missing is a test case with multiple fields in the partition by
> clauses:
>
>
I've modified the patch and added some regression tests that I think cover
all of your cases, but please let me know if I've missed any. The patch
will follow very soon.

> -- should push down, because partid is part of all PARTITION BY clauses
> explain analyze select partid,n,m from (
> select partid,
> count(*) over (partition by partid) n,
> count(*) over (partition by partid, partid+0) m
> from winagg
> ) winagg
> where partid=1;
>
> current production 9.3.4 is returning
>
>
> QUERY PLAN
> ------------------------------------------------------------
> ------------------------------------------------------------
> ----------------------------------------------------------
> Subquery Scan on winagg (cost=350955.11..420955.11 rows=20 width=20)
> (actual time=2564.360..3802.413 rows=20 loops=1)
>
> Filter: (winagg.partid = 1)
> Rows Removed by Filter: 1999980
> -> WindowAgg (cost=350955.11..395955.11 rows=2000000 width=4) (actual
> time=2564.332..3657.051 rows=2000000 loops=1)
> -> Sort (cost=350955.11..355955.11 rows=2000000 width=4)
> (actual time=2564.320..2802.444 rows=2000000 loops=1)
> Sort Key: winagg_1.partid, ((winagg_1.partid + 0))
> Sort Method: external sort Disk: 50840kB
> -> WindowAgg (cost=0.43..86948.43 rows=2000000 width=4)
> (actual time=0.084..1335.081 rows=2000000 loops=1)
> -> Index Only Scan using winagg_partid_idx on winagg
> winagg_1 (cost=0.43..51948.43 rows=2000000 width=4) (actual
> time=0.051..378.232 rows=2000000 loops=1)
> Heap Fetches: 0
>
> "Index Only Scan" currently returns all rows (without pushdown) on current
> production 9.3.4. What happens with the patch you provided?
>
>
I get a push down as expected.

QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on winagg (cost=82.71..83.31 rows=20 width=20) (actual
time=0.168..0.179 rows=20 loops=1)
-> WindowAgg (cost=82.71..83.11 rows=20 width=4) (actual
time=0.166..0.174 rows=20 loops=1)
-> Sort (cost=82.71..82.76 rows=20 width=4) (actual
time=0.151..0.154 rows=20 loops=1)
Sort Key: ((winagg_1.partid + 0))
Sort Method: quicksort Memory: 17kB
-> WindowAgg (cost=4.58..82.28 rows=20 width=4) (actual
time=0.127..0.135 rows=20 loops=1)
-> Bitmap Heap Scan on winagg winagg_1
(cost=4.58..81.98 rows=20 width=4) (actual time=0.058..0.104 rows=20
loops=1)
Recheck Cond: (partid = 1)
Heap Blocks: exact=20
-> Bitmap Index Scan on winagg_partid_idx
(cost=0.00..4.58 rows=20 width=0) (actual time=0.037..0.037 rows=20
loops=1)
Index Cond: (partid = 1)
Planning time: 0.235 ms
Total runtime: 0.280 ms

> -- Already Part of your tests:
> -- should NOT push down, because partid is NOT part of all PARTITION BY
> clauses
> explain analyze select partid,n,m from (
> select partid,
> count(*) over (partition by partid) n,
> count(*) over (partition by partid+0) m
> from winagg
> ) winagg
> where partid=1;
>
> Reordering the fields should also be tested:
> -- should push down, because partid is part of all PARTITION BY clauses
> -- here: partid at the end
> explain analyze select partid,n,m from (
> select partid,
> count(*) over (partition by partid) n,
> count(*) over (partition by partid+0, partid) m
> from winagg
> ) winagg
> where partid=1;
>
>
Covered in regression and works as expected.

> -- should push down, because partid is part of all PARTITION BY clauses
> -- here: partid in the middle
> explain analyze select partid,n,m from (
> select partid,
> count(*) over (partition by partid) n,
> count(*) over (partition by partid+0, partid, partid+1) m
> from winagg
> ) winagg
> where partid=1;
>
>
I covered this in the regression tests too.

Regards

David Rowley

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2014-04-14 11:19:40 Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
Previous Message Simon Riggs 2014-04-14 09:46:18 Re: Custom Scan APIs (Re: Custom Plan node)