Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-04-09 19:35:31
Message-ID: 24147.1397072131@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> ... an invertible aggregate may require a more
> complex transition state data structure --- in particular, if you're
> forced to go from a pass-by-value to a pass-by-reference data type, right
> there you are going to take a big hit in aggregate performance, and there
> is no way for the forward transition function to avoid it. The patch
> has in fact already done that to a couple of basic aggregates like
> sum(int4). Has anyone bothered to test what side-effects that has on
> non-windowed aggregation performance?

As a quick check, I compared aggregation performance in HEAD, non-assert
builds, with and without --disable-float8-byval on a 64-bit machine.
So this tests replacing a pass-by-val transition datatype with a
pass-by-ref one without any other changes. There's essentially no
difference in performance of sum(int4), AFAICT, but that's because
int4_sum goes out of its way to cheat and avoid palloc overhead.
I looked to the bit_and() aggregates to see what would happen to
an aggregate not thus optimized. As expected, int4 and int8 bit_and
are just about the same speed if int8 is pass by value ... but if it's
pass by ref, the int8 case is a good 60% slower.

So added palloc overhead, at least, is a no-go. I see that the patched
version of sum(int4) avoids that trap, but nonetheless it's replaced a
pretty cheap transition function with a less cheap function, namely the
function previously used for avg(int4). A quick test says that avg(int4)
is about five percent slower than sum(int4), so that's the kind of hit
we'd be taking on non-windowed aggregations if we do it like this.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-04-09 20:35:07 Re: WIP patch (v2) for updatable security barrier views
Previous Message Vik Fearing 2014-04-09 18:34:00 Buildfarm network under attack