Confusing documentation of ordered-set aggregates?

From: Florian Pflug <fgp(at)phlo(dot)org>
To: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Confusing documentation of ordered-set aggregates?
Date: 2014-01-23 02:09:48
Message-ID: 95165C04-FCBE-4AFF-AF7E-17A1C6A66DC1@phlo.org
Lists: pgsql-hackers

Hi

After reading through the relevant parts of syntax.sgml, create_aggregate.sgml
and xaggr.sgml, I think I understand how these work - they work exactly like
regular aggregates, except that some arguments are evaluated only once and
passed to the final function instead of the transition function. The whole
"ORDER BY" thing is just crazy syntax the standard mandates - a saner
alternative would have been

ordered_set_agg(direct1,...,directN, WITHIN(arg1,...,argM))

or something like that, right?
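
To make that reading concrete, here is a minimal Python model of it. It is
only a sketch of the mechanism as described above, not PostgreSQL's actual
implementation: the names run_ordered_set_agg, collect_transfn and
percentile_disc_finalfn are made up for illustration, and the choice to sort
inside the final function is an assumption of the sketch.

```python
import math
from typing import Any, Callable, Iterable, List, Optional, Sequence


def run_ordered_set_agg(
    rows: Iterable[Any],
    direct_args: Sequence[Any],
    transfn: Callable[[Optional[List[Any]], Any], List[Any]],
    finalfn: Callable[..., Any],
) -> Any:
    """Hypothetical driver: aggregated args go to the transition function
    once per row; direct args are evaluated once and go to the final function."""
    state: Optional[List[Any]] = None
    for value in rows:
        state = transfn(state, value)
    return finalfn(state, *direct_args)


def collect_transfn(state: Optional[List[Any]], value: Any) -> List[Any]:
    """Transition function: just remembers every input value, unsorted."""
    if state is None:
        state = []
    state.append(value)
    return state


def percentile_disc_finalfn(state: Optional[List[Any]], fraction: float) -> Any:
    """Final function: does the sorting itself (an assumption of this sketch),
    then picks the first value whose position reaches the requested fraction."""
    if not state:
        return None
    ordered = sorted(state)
    position = max(math.ceil(fraction * len(ordered)), 1)  # 1-based row number
    return ordered[position - 1]


# Roughly: percentile_disc(0.5) WITHIN GROUP (ORDER BY x) over four rows.
median = run_ordered_set_agg(
    [30, 10, 20, 40], (0.5,), collect_transfn, percentile_disc_finalfn
)
```

In this model nothing outside the support functions ever sorts the rows, so
whether the "ORDER BY" has any effect depends entirely on what the final
function chooses to do with the collected state.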

So whether "ORDER BY" implies any actual ordering is up to the ordered-set
aggregate's final function. Or at least that's what xaggr.sgml seems to say:

Unlike the case for normal aggregates, the sorting of input rows for an
ordered-set aggregate is <emphasis>not</> done behind the scenes, but is
the responsibility of the aggregate's support functions.

but that seems to contradict syntax.sgml, which says:

The expressions in the <replaceable>order_by_clause</replaceable> are
evaluated once per input row just like normal aggregate arguments, sorted
as per the <replaceable>order_by_clause</replaceable>'s requirements, and
fed to the aggregate function as input arguments.

Also, xaggr.sgml has the following to explain why NULLs are passed for all
aggregated arguments to the final function, instead of simply not passing them
at all:

While the null values seem useless at first sight, they are important because
they make it possible to include the data types of the aggregated input(s) in
the final function's signature, which may be necessary to resolve the output
type of a polymorphic aggregate.

Why do ordered-set aggregates require that, when plain aggregates are fine
without it? array_agg(), for example, also has a result type that is
determined by the argument type, yet its final function doesn't take an
argument of type anyelement, even though it returns anyarray.

best regards,
Florian Pflug
