Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-04-10 06:50:03
Message-ID: CAApHDvqwOAN-phRtCTiCJXdr1+ShkkSz=6pJn-hijO=z4EMEaw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 10, 2014 at 9:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Florian Pflug <fgp(at)phlo(dot)org> writes:
> > I was (and still am) not in favour of duplicating the whole quadruple of
> > (state, initialvalue, transferfunction, finalfunction) because it seems
> > excessive. In fact, I believed that doing this would probably be grounds for
> > outright rejection of the patch, on the base of catalog bloat. And your
> > initial response to this suggestion seemed to confirm this.
>
> Well, I think it's much more likely that causing a performance penalty for
> cases unrelated to window aggregates would lead to outright rejection :-(.
> The majority of our users probably don't ever use window functions, but
> for sure they've heard of SUM(). We can't penalize the non-window case.
>
> Expanding pg_aggregate from 10 columns (as per patch) to 14 (as per this
> suggestion) is a little annoying but it doesn't sound like a show stopper.
> It seems reasonable to assume that the extra initval would be NULL in most
> cases, so it's probably a net addition of 12 bytes per row.
>
>
I also wouldn't imagine that the overhead of storing that would be too
great. And are there really any databases out there that have thousands of
custom aggregate functions?

I'm actually quite glad to see that someone agrees with me on this. I think it
opens up quite a few extra optimisation opportunities for things like
MAX and MIN. In those cases we could track the number of values equal to the
current max, and reset that count whenever we get a bigger value. That way we
could report the inverse transition as successful if maxcount is still above 0
after the removal of a max value, similar to how I implemented the inverse
transition for sum(numeric). In fact, doing it this way would mean that
inverse transitions for sum(numeric) would never fail and retry. I just
thought we had gotten to a stage of not requiring this because the overheads
were so low. I was quite surprised to see the count tracking account for
5% for sum(int).

What I don't quite understand yet is why we can't just create a new function
for int inverse transitions instead of borrowing the inverse transition
functions for avg?
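
To make the MAX idea above a bit more concrete, here is a rough standalone
sketch of the count tracking. It is not taken from the patch: MaxState,
max_init, max_forward and max_inverse are hypothetical names, the state is a
plain struct rather than a real aggregate transition state, and the input is
assumed to be int64 with no NULLs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical transition state: the current max plus how many copies of it
 * are inside the window frame. max is only meaningful when maxcount > 0. */
typedef struct MaxState
{
    int64_t max;
    int64_t maxcount;
} MaxState;

static void
max_init(MaxState *s)
{
    s->max = 0;
    s->maxcount = 0;
}

/* Forward transition: a bigger value replaces the max and resets the count,
 * an equal value bumps the count, a smaller value changes nothing. */
static void
max_forward(MaxState *s, int64_t v)
{
    if (s->maxcount == 0 || v > s->max)
    {
        s->max = v;
        s->maxcount = 1;
    }
    else if (v == s->max)
        s->maxcount++;
}

/* Inverse transition: returns true if the state is still valid after
 * removing v, false if the caller has to rebuild the state by rescanning
 * the frame. Assumes v was previously added and the state is valid. */
static bool
max_inverse(MaxState *s, int64_t v)
{
    assert(s->maxcount > 0);

    if (v < s->max)
        return true;            /* removing a non-max value is always fine */

    /* v is one of the copies of the current max */
    s->maxcount--;
    return s->maxcount > 0;
}
```

With this, removing a max value only forces a rescan when it was the last
copy of that max in the frame. On failure the caller would be expected to
call max_init and feed the remaining frame rows through max_forward again.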

Regards

David Rowley
