Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-09 17:44:18
Message-ID: 9EF989F8-AF06-42B2-89CC-C6FE7AF54D67@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jan9, 2014, at 18:09 , Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Florian Pflug <fgp(at)phlo(dot)org> writes:
>> For float 4 and float8, wasn't the consensus that the potential
>> lossy-ness of addition makes this impossible anyway, even without the
>> NaN issue? But...
>
> Well, that was my opinion, I'm not sure if it was consensus ;-).

I'd say your example showing how it could produce completely bogus
results was pretty convincing...

> But NaN is an orthogonal problem I think. I'm not sure whether it
> has analogues in other data types.

Transfer functions which are partially invertible are not that
uncommon, I'd say. Browsing through 9.3's list of aggregate functions,
the following come to mind

max()
Values smaller than the maximum can be removed, removing the current
maximum requires a rescan. By remembering the N largest values,
the number of required rescans can be reduced, but never fully
eliminated. Same works for min().

bool_or()
FALSE can be removed, removing TRUE requires a rescan. Could be made
fully invertible by counting the number of TRUE and FALSE values,
similar to my suggestion for how to handle NaN for sum(numeric).
Same works for bool_and().

bit_or()
Like boo_or(), 0 can be removed, everything else requires a rescan.
Same works for bit_and()

Plus, any aggregate with a strict transfer function would be in
exactly the same situation regarding NULL as sum(numeric) is regarding
NaN. AFAIK we don't have any such aggregate in core, though.

best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-01-09 17:44:24 Re: integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
Previous Message Robert Haas 2014-01-09 17:37:48 Re: integrate pg_upgrade analyze_new_cluster.sh into vacuumdb