Quick Links

Why percent_rank is so slower than rank?

From:	Jie Li <jay23jack(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Why percent_rank is so slower than rank?
Date:	2010-12-09 07:26:09
Message-ID:	AANLkTi=Jxs_mG4kJuxEWLdKvwREHYBeSA40AYnXqYhtq@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi all,

I'm new to window functions. Recently I run some simple queries but
surprised to find percent_rank is so slower than rank, could anybody tell me
why?

The rank query result:
test=# explain analyze select inv_date_sk,inv_item_sk, rank()over(partition
by inv_date_sk order by inv_item_sk) from inventory1;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=19563.99..23343.99 rows=189000 width=8) (actual
time=631.947..1361.158 rows=189000 loops=1)
-> Sort (cost=19563.99..20036.49 rows=189000 width=8) (actual
time=631.924..771.990 rows=189000 loops=1)
Sort Key: inv_date_sk, inv_item_sk
Sort Method: quicksort Memory: 12218kB
-> Seq Scan on inventory1 (cost=0.00..3000.00 rows=189000
width=8) (actual time=0.055..198.948 rows=189000 loops=1)
Total runtime: 1500.193 ms
(6 rows)

The percent_rank result:
test=# explain analyze select inv_date_sk,inv_item_sk,
percent_rank()over(partition by inv_date_sk order by inv_item_sk) from
inventory1;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=19563.99..23343.99 rows=189000 width=8) (actual
time=766.432..32924.804 rows=189000 loops=1)
-> Sort (cost=19563.99..20036.49 rows=189000 width=8) (actual
time=756.320..905.407 rows=189000 loops=1)
Sort Key: inv_date_sk, inv_item_sk
Sort Method: quicksort Memory: 12218kB
-> Seq Scan on inventory1 (cost=0.00..3000.00 rows=189000
width=8) (actual time=0.102..224.607 rows=189000 loops=1)
Total runtime: 33152.188 ms
(6 rows)

One special thing is that all the values of the partition key(inv_date_sk)
are the same, that is, there is only one window partition. I find that
percent_rank needs to buffer all the tuples to get the total number of rows.
But why is it so expensive?

I use 8.4.4. And I only increase the work_mem to 100M and leave other
parameters untouched.

Thanks,
Li Jie

Responses

Re: Why percent_rank is so slower than rank? at 2010-12-09 21:01:23 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Simon Riggs	2010-12-09 07:52:11	Re: Hot Standby btree delete records and vacuum_defer_cleanup_age
Previous Message	Jeff Janes	2010-12-09 05:44:13	Re: BufFreelistLock