Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-10 21:27:49
Message-ID: CAApHDvp7FdStiLsYR6qK+kyp3XeOhE=64Z5xftULK-_dLx9a_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 11, 2014 at 7:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Although, having said that ... maybe "build your own aggregate" would
> be a reasonable suggestion for people who need this? I grant that
> it's going to be a minority requirement, maybe even a small minority
> requirement. People who have the chops to get this sort of thing right
> can probably manage a custom aggregate definition.
>
>
I more or less wrote off the idea of inverse transition functions after
your example upthread. I had thought that perhaps if we could get inverse
transitions in there for SUM(numeric) then people who need more speed could
just cast their value to numeric then back to float or double precision
after aggregation takes place. I had to delay writing any documentation
around that as I'm still not sure if we can have sum(numeric) use an
inverse transition function due to the fact that it can introduce extra
zeros after the decimal point.

As the patch stands at the moment, I currently have a regression test which
currently fails due to these extra zeros after the decimal point:

-- This test currently fails due extra trailing 0 digits.
SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND
UNBOUNDED FOLLOWING)
FROM (VALUES(1,1.01),(2,2),(3,3)) v(i,n);

Patched produces:
6.01
5.00
3.00
Unpatched produces:
6.01
5
3

With inverse transitions this query still produces correct results, it just
does not produces the numeric in the same format as it does without
performing inverse transitions. Personally I'd rather focus on trying to
get SUM(numeric) in there for 9.4 and maybe focus on floating point stuff
at a later date as casting to numeric can be the work around for users who
complain about the speed. Or if they really want they can create their own
aggregate, using an existing built in function as the inverse transition,
like float8_mi.

There's certain things that currently seem a big magical to me when it
comes to numeric, for example I've no idea why the following query produces
20 0's after the decimal point for 1 and only 16 for 2.

select n::numeric / 1 from generate_series(1,2) g(n);

To me it does not look very consistent at all and I'm really wondering if
there is some special reason why we bother including the useless zeros at
the end at all. I've written a patch which gets rid of them in numeric_out,
but I had not planned on posting it here in case it gets laughed off stage
due to some special reason we have for keeping those zeros that I don't
know about.

Can anyone explain to me why we have these unneeded zeros in numeric when
the precision is not supplied?

Regards

David Rowley

> The constraint this would pose on the float4 and float8 implementations
> is that it be possible to use their transition and final functions in
> a custom aggregate declaration while leaving off the inverse function;
> or, if that combination doesn't work for some reason, we have to continue
> to provide the previous transition/final functions for use in user
> aggregates.
>
> Suitable documentation would be needed too, of course.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2014-01-10 21:34:35 Re: Time to do our Triage for 9.4
Previous Message Jim Nasby 2014-01-10 21:26:04 Re: Disallow arrays with non-standard lower bounds