Re: WITHIN GROUP patch

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: WITHIN GROUP patch
Date: 2014-01-07 22:46:28
Message-ID: 874n5fzagi.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> Initial tests suggest that your version is ~40% slower than ours on
>> some workloads.

Tom> I poked at this a bit with perf and oprofile, and concluded that
Tom> probably the difference comes from ordered_set_startup()
Tom> repeating lookups for each group that could be done just once
Tom> per query.

Retesting with your changes shows that the gap is down to about 15%
(measured as time over the baseline) but is still present:

work_mem=64MB enable_hashagg=off (for baseline test)

baseline query (333ms on both versions):
select count(*)
  from (select j
          from generate_series(1,3) i,
               generate_series(1,100000) j
         group by j) s;

test query:
select count(*)
  from (select percentile_disc(0.5) within group (order by i)
          from generate_series(1,3) i,
               generate_series(1,100000) j
         group by j) s;

On the original patch as supplied: 571ms - 333ms = 238ms over baseline
On current master: 607ms - 333ms = 274ms over baseline

Furthermore, I can't help noticing that the increased complexity has
now pretty much negated your original arguments for moving so much of
the work out of nodeAgg.c.

--
Andrew (irc:RhodiumToad)
