slow count in window query

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: slow count in window query
Date: 2009-07-15 10:18:54
Message-ID: 162867790907150318y3dccb77ep1ba6eb94742e72c8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I did some test - median via window function - I found probably some
bad optimised code. I found two methods - Celko and Itzik Ben-Gan.
Ben-Gan methoud should to be faster - there is one sort less, but in
practice - it is 2 times slower.

create table x(a integer);
insert into x select (random()*10000)::int from generate_series(1,10000);

Celko method:
postgres=# explain select avg(a)
from (select a, row_number() over
(order by a asc) as hi,
row_number()
over (order by a desc) as lo,
from x) s
where hi in (lo-1,lo+1);
QUERY PLAN
-------------------------------------------------------------------------------------------------
Aggregate (cost=2144.02..2144.03 rows=1 width=4)
-> Subquery Scan s (cost=1643.77..2143.77 rows=100 width=4)
Filter: ((s.hi = (s.lo - 1)) OR (s.hi = (s.lo + 1)))
-> WindowAgg (cost=1643.77..1943.77 rows=10000 width=4)
-> WindowAgg (cost=1643.77..1818.77 rows=10000 width=4)
-> Sort (cost=1643.77..1668.77 rows=10000 width=4)
Sort Key: x.a
-> WindowAgg (cost=804.39..979.39
rows=10000 width=4)
-> Sort (cost=804.39..829.39
rows=10000 width=4)
Sort Key: x.a
-> Seq Scan on x
(cost=0.00..140.00 rows=10000 width=4)
(11 rows)

Ben-Gan:

postgres=# explain select avg(a)
from (select a, row_number() over
(order by a) as r,
count(*) over () as rc
from x ) p
where r in ((rc+1)/2,(rc+2)/2) ;
QUERY PLAN
-------------------------------------------------------------------------------------
Aggregate (cost=1354.64..1354.65 rows=1 width=4)
-> Subquery Scan p (cost=804.39..1354.39 rows=100 width=4)
Filter: ((p.r = ((p.rc + 1) / 2)) OR (p.r = ((p.rc + 2) / 2)))
-> WindowAgg (cost=804.39..1104.39 rows=10000 width=4)
-> WindowAgg (cost=804.39..979.39 rows=10000 width=4)
-> Sort (cost=804.39..829.39 rows=10000 width=4)
Sort Key: x.a
-> Seq Scan on x (cost=0.00..140.00
rows=10000 width=4)
(8 rows)

but
postgres=# select avg(a) from (select a, row_number() over (order by
a) as r, count(*) over () as rc from x ) p where r in
((rc+1)/2,(rc+2)/2) ;
avg
-----------------------
5027.0000000000000000
(1 row)

Time: 179,310 ms

postgres=# select avg(a) from (select a, row_number() over (order by a
asc) as hi, row_number() over (order by a desc) as lo from x) s where
hi in (lo-1,lo+1);
avg
-----------------------
5027.0000000000000000
(1 row)

Time: 78,791 ms

When I checked diff, I found, so the problem is count() function.

count(*) over () is very slow. - maybe so this is standard aggregate?

Regards
Pavel Stehule


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-16 11:02:39
Message-ID: 407d949e0907160402y26bb728agabcbf2b27d9d6a43@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule<pavel(dot)stehule(at)gmail(dot)com> wrote:
> postgres=# select avg(a) from (select a, row_number() over (order by
> a) as r, count(*) over () as rc from x ) p where r in
> ((rc+1)/2,(rc+2)/2) ;

How does this compare to the plain non-windowing SQL implementation:

select a from x order by a offset (select trunc(count(*)/2) from x) limit 1

(except that that only works if count(*) is odd).

Interestingly finding the median is actually O(n) using Quickselect.
Maybe we should provide a C implementation of quickselect as a window
function. I'm not sure how to wedge in the concept that the sort is
unnecessary even though the ORDER BY is specified though.

I'm also not sure how to handle this if the set has to be spooled to
disk. Quicksort and Quickselect do a lot of scans throught he data and
wouldn't perform well on disk.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-16 11:07:03
Message-ID: 162867790907160407u6bb86ac4n5e4bfacd674a8234@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I'm also not sure how to handle this if the set has to be spooled to
> disk. Quicksort and Quickselect do a lot of scans throught he data and
> wouldn't perform well on disk.

I thing, so problem is in aggregate func used as window func - or some
missing optimalisation.

when I replaced count(*) over () by subselect (SELECT count(*) FROM
...) then I got expected speed.

Pavel

>
> --
> greg
> http://mit.edu/~gsstark/resume.pdf
>


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-16 13:05:52
Message-ID: e08cc0400907160605m54ad9843p7f76fcb71eb26acf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/16 Greg Stark <gsstark(at)mit(dot)edu>:
> On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule<pavel(dot)stehule(at)gmail(dot)com> wrote:
>> postgres=# select avg(a) from (select a, row_number() over (order by
>> a) as r, count(*) over () as rc from x ) p where r in
>> ((rc+1)/2,(rc+2)/2) ;
>
> How does this compare to the plain non-windowing SQL implementation:
>
> select a from x order by a offset (select trunc(count(*)/2) from x) limit 1
>
> (except that that only works if count(*) is odd).
>
> Interestingly finding the median is actually O(n) using Quickselect.
> Maybe we should provide a C implementation of quickselect as a window
> function. I'm not sure how to wedge in the concept that the sort is
> unnecessary even though the ORDER BY is specified though.

median() should be aggregate, not window function, shouldn't it?

>
> I'm also not sure how to handle this if the set has to be spooled to
> disk. Quicksort and Quickselect do a lot of scans throught he data and
> wouldn't perform well on disk.

The WindowAgg spools rows into the tuplestore, which holds the data in
memory as far as it fits in. Do you have any idea how it stores
millons of millions of rows without tuplestore?

Regards,

--
Hitoshi Harada


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-16 13:40:32
Message-ID: e08cc0400907160640s358187cj4fc156eef190784b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/16 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> I'm also not sure how to handle this if the set has to be spooled to
>> disk. Quicksort and Quickselect do a lot of scans throught he data and
>> wouldn't perform well on disk.
>
> I thing, so problem is in aggregate func used as window func - or some
> missing optimalisation.
>
> when I replaced count(*) over () by subselect (SELECT count(*) FROM
> ...) then I got expected speed.
>

WindowAgg always spools its input in the buffer though (in your case)
it throws away row by row, so compared with pure aggregate it has
overhead. I think this is reasonable approach for large data situation
and different type of window. But yes, we must improve the current
model.

1) There should be some kind of lightweight approach for such
small-data/simple-window situations.

2) tuplestore_puttupleslot() seems to me heavy (copy, check, etc) even
if the data fits in the memory by triming rows. We want to have more
flexible temporary storage on the fly.

Regards,

--
Hitoshi Harada


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-16 16:06:36
Message-ID: 162867790907160906n588a9292wcfc85cc2aee68c0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/16 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> 2009/7/16 Greg Stark <gsstark(at)mit(dot)edu>:
>> On Wed, Jul 15, 2009 at 11:18 AM, Pavel Stehule<pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> postgres=# select avg(a) from (select a, row_number() over (order by
>>> a) as r, count(*) over () as rc from x ) p where r in
>>> ((rc+1)/2,(rc+2)/2) ;
>>
>> How does this compare to the plain non-windowing SQL implementation:
>>
>> select a from x order by a offset (select trunc(count(*)/2) from x) limit 1
>>
>> (except that that only works if count(*) is odd).
>>
>> Interestingly finding the median is actually O(n) using Quickselect.
>> Maybe we should provide a C implementation of quickselect as a window
>> function. I'm not sure how to wedge in the concept that the sort is
>> unnecessary even though the ORDER BY is specified though.
>
> median() should be aggregate, not window function, shouldn't it?
>
yes - the core of my topic is significant slowness query, that use
window functions, when aggregate function was used. This case could be
simply optimized.

This case isn't important for me. Simply I played with w.f. and I
found Celko's query - and I was surprised, because this query was
faster, then other - I expected some else.

>>
>> I'm also not sure how to handle this if the set has to be spooled to
>> disk. Quicksort and Quickselect do a lot of scans throught he data and
>> wouldn't perform well on disk.
>
> The WindowAgg spools rows into the tuplestore, which holds the data in
> memory as far as it fits in. Do you have any idea how it stores
> millons of millions of rows without tuplestore?
>
> Regards,
>
>
> --
> Hitoshi Harada
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-17 04:12:39
Message-ID: 162867790907162112j6bf48e92wdcc0934ae2e1fc39@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

look on:
postgres=# explain select count(*) over () from x;
QUERY PLAN
-------------------------------------------------------------
WindowAgg (cost=0.00..265.00 rows=10000 width=0)
-> Seq Scan on x (cost=0.00..140.00 rows=10000 width=0)
(2 rows)

Time: 1,473 ms
postgres=# explain select count(*) over (order by a) from x;
QUERY PLAN
------------------------------------------------------------------------
WindowAgg (cost=0.00..556.25 rows=10000 width=4)
-> Index Scan using gg on x (cost=0.00..406.25 rows=10000 width=4)
(2 rows)

but
query1: 160ms
query2: 72ms

regards
Pavel Stehule


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-17 07:19:53
Message-ID: e08cc0400907170019l2bf9fdbdj328ea0ddd669def@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/17 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello
>
> look on:
> postgres=# explain select count(*) over () from x;
>                         QUERY PLAN
> -------------------------------------------------------------
>  WindowAgg  (cost=0.00..265.00 rows=10000 width=0)
>   ->  Seq Scan on x  (cost=0.00..140.00 rows=10000 width=0)
> (2 rows)
>
> Time: 1,473 ms
> postgres=# explain select count(*) over (order by a) from x;
>                               QUERY PLAN
> ------------------------------------------------------------------------
>  WindowAgg  (cost=0.00..556.25 rows=10000 width=4)
>   ->  Index Scan using gg on x  (cost=0.00..406.25 rows=10000 width=4)
> (2 rows)
>
> but
> query1: 160ms
> query2: 72ms

Well, how about "select a from x order by a"?
I wonder if index scan affects more than windowagg.

--
Hitoshi Harada


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-17 14:34:09
Message-ID: 4A6045910200002500028902@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

> postgres=# explain select count(*) over () from x;

> WindowAgg (cost=0.00..265.00 rows=10000 width=0)
> -> Seq Scan on x (cost=0.00..140.00 rows=10000 width=0)

> postgres=# explain select count(*) over (order by a) from x;

> WindowAgg (cost=0.00..556.25 rows=10000 width=4)
> -> Index Scan using gg on x (cost=0.00..406.25 rows=10000
width=4)

> query1: 160ms
> query2: 72ms

EXPLAIN ANALYZE is more telling than just EXPLAIN.

Did you run both several times or flush caches carefully between the
runs to eliminate caching effects?

Is it possible that there are a lot of dead rows in the table (from
UPDATEs or DELETEs), and the table has been vacuumed? (Output from
VACUUM VERBOSE on the table would show that.)

-Kevin


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-17 14:43:37
Message-ID: 162867790907170743h3bb2a62atc9e9798a6152a687@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/17 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>> postgres=# explain select count(*) over () from x;
>
>>  WindowAgg  (cost=0.00..265.00 rows=10000 width=0)
>>    ->  Seq Scan on x  (cost=0.00..140.00 rows=10000 width=0)
>
>> postgres=# explain select count(*) over (order by a) from x;
>
>>  WindowAgg  (cost=0.00..556.25 rows=10000 width=4)
>>    ->  Index Scan using gg on x  (cost=0.00..406.25 rows=10000
> width=4)
>
>> query1: 160ms
>> query2: 72ms
>
> EXPLAIN ANALYZE is more telling than just EXPLAIN.

Query1

QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=931.50..931.51 rows=1 width=4) (actual
time=274.423..274.425 rows=1 loops=1)
-> Subquery Scan p (cost=0.00..931.25 rows=100 width=4) (actual
time=220.220..274.388 rows=2 loops=1)
Filter: ((p.r = ((p.rc + 1) / 2)) OR (p.r = ((p.rc + 2) / 2)))
-> WindowAgg (cost=0.00..681.25 rows=10000 width=4) (actual
time=120.622..247.618 rows=10000 loops=1)
-> WindowAgg (cost=0.00..556.25 rows=10000 width=4)
(actual time=0.088..89.848 rows=10000 loops=1)
-> Index Scan using gg on x (cost=0.00..406.25
rows=10000 width=4) (actual time=0.066..33.962 rows=10000 loops
Total runtime: 274.934 ms
(7 rows)

query2:

postgres=# explain analyze select avg(a) from (select a, row_number()
over (order by a asc) as hi, row_number() over (order by a desc) as lo
from x) s where hi in (lo-1,lo+1);

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1595.89..1595.90 rows=1 width=4) (actual
time=215.101..215.103 rows=1 loops=1)
-> Subquery Scan s (cost=1220.63..1595.63 rows=100 width=4)
(actual time=175.159..215.073 rows=1 loops=1)
Filter: ((s.hi = (s.lo - 1)) OR (s.hi = (s.lo + 1)))
-> WindowAgg (cost=1220.63..1395.63 rows=10000 width=4)
(actual time=136.985..191.231 rows=10000 loops=1)
-> Sort (cost=1220.63..1245.63 rows=10000 width=4)
(actual time=136.970..151.905 rows=10000 loops=1)
Sort Key: x.a
Sort Method: quicksort Memory: 686kB
-> WindowAgg (cost=0.00..556.25 rows=10000
width=4) (actual time=0.078..106.927 rows=10000 loops=1)
-> Index Scan using gg on x
(cost=0.00..406.25 rows=10000 width=4) (actual time=0.058..33.594
rows=10000
Total runtime: 215.845 ms
(10 rows)

>
> Did you run both several times or flush caches carefully between the
> runs to eliminate caching effects?

yes, - in both variants data was read from cache.

>
> Is it possible that there are a lot of dead rows in the table (from
> UPDATEs or DELETEs), and the table has been vacuumed?  (Output from
> VACUUM VERBOSE on the table would show that.)
>

table was filled with random numbers and analyzed - you can simple
check it - look on begin of the thread. This table wasn't updated.

Pavel

> -Kevin
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-17 14:48:33
Message-ID: 162867790907170748k59efd591wd47703d780b4bf65@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/17 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> 2009/7/17 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> Hello
>>
>> look on:
>> postgres=# explain select count(*) over () from x;
>>                         QUERY PLAN
>> -------------------------------------------------------------
>>  WindowAgg  (cost=0.00..265.00 rows=10000 width=0)
>>   ->  Seq Scan on x  (cost=0.00..140.00 rows=10000 width=0)
>> (2 rows)
>>
>> Time: 1,473 ms
>> postgres=# explain select count(*) over (order by a) from x;
>>                               QUERY PLAN
>> ------------------------------------------------------------------------
>>  WindowAgg  (cost=0.00..556.25 rows=10000 width=4)
>>   ->  Index Scan using gg on x  (cost=0.00..406.25 rows=10000 width=4)
>> (2 rows)
>>
>> but
>> query1: 160ms
>> query2: 72ms
>
> Well, how about "select a from x order by a"?
> I wonder if index scan affects more than windowagg.
>

select a from x -> 42ms
select a from x order by a -> 50ms

all data are from cache.

> --
> Hitoshi Harada
>


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>, "Greg Stark" <gsstark(at)mit(dot)edu>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-17 16:00:03
Message-ID: 4A6059B30200002500028918@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

> table was filled with random numbers and analyzed - you can simple
> check it - look on begin of the thread. This table wasn't updated.

Confirmed. The ORDER BY consistently speeds up the query. Odd....

Sort speed varied based on random sequence generated, but typical
plan and timings:

test=# explain analyze select count(*) over () from x;
WindowAgg (cost=0.00..229.00 rows=10000 width=0) (actual
time=32.435..97.448 rows=10000 loops=1)
-> Seq Scan on x (cost=0.00..104.00 rows=10000 width=0) (actual
time=0.007..14.818 rows=10000 loops=1)
Total runtime: 112.526 ms

test=# explain analyze select count(*) over (order by a) from x;
WindowAgg (cost=768.39..943.39 rows=10000 width=4) (actual
time=34.982..87.803 rows=10000 loops=1)
-> Sort (cost=768.39..793.39 rows=10000 width=4) (actual
time=34.962..49.533 rows=10000 loops=1)
Sort Key: a
Sort Method: quicksort Memory: 491kB
-> Seq Scan on x (cost=0.00..104.00 rows=10000 width=4)
(actual time=0.006..14.682 rows=10000 loops=1)
Total runtime: 102.023 ms

-Kevin


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-18 04:40:53
Message-ID: e08cc0400907172140h118d91eflaaa52547a7c37786@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/18 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>> table was filled with random numbers and analyzed - you can simple
>> check it - look on begin of the thread. This table wasn't updated.
>
> Confirmed.  The ORDER BY consistently speeds up the query.  Odd....
>
> Sort speed varied based on random sequence generated, but typical
> plan and timings:

Kevin's result is quite odd. I confirmed that using IndexScan looked
fater in Pavel's result but yours is with Sort node.

I found that those results are seen in relatively small set. I
increased the source table up to 100000 rows and the OVER (ORDER BY a)
case got slower.

What really suprised me is in any case without ORDER BY clause in the
window, WindowAgg node starts quite later than the lower node
finishes.

> test=# explain analyze select count(*) over () from x;
> WindowAgg (cost=0.00..229.00 rows=10000 width=0) (actual
> time=32.435..97.448 rows=10000 loops=1)
> -> Seq Scan on x (cost=0.00..104.00 rows=10000 width=0) (actual
> time=0.007..14.818 rows=10000 loops=1)
> Total runtime: 112.526 ms

I had thought WindowAgg actual time would be 14.xxx ... 97.448 but
actually 32.435 ....97.448. ORDER BY case returns the first result as
soon as underneath Sort (or IndexScan) returns the first (actually the
second), because window frame has only a row. But even the frame
contains all the row (i.e. OVER() case) can return the first row not
so later than the underneath node returns the last.

If I understand exlain analyze correctly and it tells us the fact,
WindowAgg without ORDER BY clause gets unreasonably slow. Let me see.

Regards,

--
Hitoshi Harada


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: slow count in window query
Date: 2009-07-29 17:20:17
Message-ID: e08cc0400907291020v7381fa35j18dc1eccd01352c7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/7/18 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> If I understand exlain analyze correctly and it tells us the fact,
> WindowAgg without ORDER BY clause gets unreasonably slow. Let me see.
>

I haven't determined the difference between with and without ORDER BY
clause in OVER(), but I took a benchmark that throws an interesting
result.

$ bin/psql regression -c 'explain analyze select count(*) over() from x'
QUERY PLAN

--------------------------------------------------------------------------------
--------------------------------
WindowAgg (cost=0.00..2741.00 rows=100000 width=0) (actual time=3725.294..4559
.828 rows=100000 loops=1)
-> Seq Scan on x (cost=0.00..1491.00 rows=100000 width=0) (actual time=0.11
2..310.349 rows=100000 loops=1)
Total runtime: 4811.115 ms
(3 rows)

The query is quite slow because profiling hook function calls
gettimeofday() each time. And here's the result that counted up
eval_windowaggregate() call and its children functions. Elapse time is
in second and it is subtracted with total gettimeofday() overhead.

eval_windowaggregates:
Count 100000
Elapse 0.588426

Address |Name |Count |Elapse(Total)
0x8204067|initialize_windowaggregate | 1| 0.000277
0x8204d4a|spool_tuples |100002| 0.620092
0x83dcd08|tuplestore_select_read_pointer|100001| 0.011080
0x83dda2f|tuplestore_gettupleslot |100001| 0.049005
0x8204fdd|row_is_in_frame |100000| 0.014978
0x8204168|advance_windowaggregate |100000| 0.025675
0x81ead8a|ExecClearTuple |100000| 0.022105
0x8204462|finalize_windowaggregate | 1| 0.000015
0x8204120|MemoryContextSwitchTo | 2| 0.000000

spool_tuples() is dominant in eval_windowaggregates(). I think it is
not needed if the query contains only simple aggregate like count(*)
OVER () but currently we copy all the rows from the source table to
tuplestore. Even if it fits in memory, the copy operation costs too
much.

I am thinking about how to avoid unnecessary copy overhead...

Regards,

---
Hitoshi Harada