diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a12ee56..883fb50 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 381,386 ****
--- 381,392 ----
Transition function
+ agginvtransfn
+ regproc
+ pg_proc.oid
+ Inverse transition function
+
+ aggfinalfnregprocpg_proc.oid
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6e2fbda..59ce91e 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*************** SELECT xmlagg(x) FROM (SELECT x FROM tes
*** 13310,13315 ****
--- 13310,13437 ----
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING>.
Other frame specifications can be used to obtain other effects.
+
+
+ Depending on the aggregate function, aggregating over frames starting
+ at a row relative to the current row can be drastically less efficient
+ than aggregating over frames aligned to the start of the partition. The
+ frame starts at a row relative to the current row if ORDER
+ BY is used together with any frame start clause other than
+ UNBOUNDED PRECEDING (which is the default). Then,
+ aggregates without a suitable inverse transition function
+ (see for details) will be
+ computed for each frame from scratch, instead of re-using the previous
+ frame's result, causing quadratic growth of the
+ execution time as the number of rows per partition increases. The table
+ list the built-in aggregate
+ functions affected by this. Note that quadratic growth is only a problem
+ if partitions contain many rows - for partitions with only a few rows,
+ even inefficient aggregates are unlikely to cause problems.
+
+
+
+
+ Aggregate Function Behaviour for frames not starting at
+ UNBOUNDED PRECEDING.
+
+
+
+
+
+
+ Aggregate Function
+ Input Type
+ Computed From Scratch
+
+
+
+
+
+
+
+ any numerical aggregate
+
+
+ float4
+ or
+ float8
+
+ always, to avoid error accumulation
+
+
+
+
+ any numerical aggregate
+
+
+ numeric
+
+ if the maximum number of decimal digits within the inputs changes
+
+
+
+
+ min
+
+
+ any
+
+ if some inputs in the old frame that aren't in the new frame were minimal
+
+
+
+
+ max
+
+
+ any
+
+ if some inputs in the old frame that aren't in the new frame were maximal
+
+
+
+
+ bit_and,bit_or
+
+
+ any
+
+ always
+
+
+
+
+ string_agg
+
+
+ text or
+ bytea or
+
+ if the delimiter lengths vary
+
+
+
+
+ xmlagg
+
+
+ xml
+
+ always
+
+
+
+
+ json_agg
+
+
+ json
+
+ always
+
+
+
+
diff --git a/doc/src/sgml/ref/create_aggregate.sgml b/doc/src/sgml/ref/create_aggregate.sgml
index e5fc718..9bdb2eb 100644
*** a/doc/src/sgml/ref/create_aggregate.sgml
--- b/doc/src/sgml/ref/create_aggregate.sgml
*************** CREATE AGGREGATE sfunc,
STYPE = state_data_type
[ , SSPACE = state_data_size ]
+ [ , INVSFUNC = inv_trans_func ]
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
*************** CREATE AGGREGATE sfunc,
STYPE = state_data_type
[ , SSPACE = state_data_size ]
+ [ , INVSFUNC = inv_trans_func ]
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
*************** CREATE AGGREGATE sfunc,
and an optional final calculation function
ffunc.
These are used as follows:
sfunc( internal-state, next-data-values ) ---> next-internal-state
ffunc( internal-state ) ---> aggregate-value
--- 86,103 ----
! An aggregate function is made from one, two or three ordinary
functions:
! a (forward) state transition function
sfunc,
+ an optional inverse state transition function
+ invsfunc,
and an optional final calculation function
ffunc.
These are used as follows:
sfunc( internal-state, next-data-values ) ---> next-internal-state
+ invsfunc( internal-state, data-values ) ---> internal-state-without-data-values
ffunc( internal-state ) ---> aggregate-value
*************** CREATE AGGREGATE stype
to hold the current internal state of the aggregate. At each input row,
the aggregate argument value(s) are calculated and
! the state transition function is invoked with the current state value
and the new argument value(s) to calculate a new
! internal state value. After all the rows have been processed,
! the final function is invoked once to calculate the aggregate's return
! value. If there is no final function then the ending state value
! is returned as-is.
--- 107,134 ----
of data type stype
to hold the current internal state of the aggregate. At each input row,
the aggregate argument value(s) are calculated and
! the forward state transition function is invoked with the current state value
and the new argument value(s) to calculate a new
! internal state value.
! If the aggregate is computed over a sliding frame, i.e. if it is used as a
! window function, the inverse transition function is
! used to undo the effect of a previous invocation of the forward transition
! function once argument value(s) fall out of the sliding frame.
! Conceptually, the forward transition functions thus adds some input
! value(s) to the state, and the inverse transition functions removes them
! again. Values are, if they are removed, always removed in the same order
! they were added, without gaps. Whenever the inverse transition function is
! invoked, it will thus receive the earliest added but not yet removed
! argument value(s). If no inverse transition function is supplied, the
! aggregate can still be used to aggregate over sliding frames, but with
! reduced efficiency. PostgreSQL will then
! recompute the whole aggregation whenever the start of the frame moves. To
! calculate the aggregate's return value, the final function is invoked on
! the ending state value. If there is no final function then ending state
! value is returned as-is. Either way, the result is assumed to reflect the
! aggregation of all values added but not yet removed from the state value.
! Note that if the aggregate is used as a window function, the aggregation
! may be continued after the final function has been called.
*************** CREATE AGGREGATE state_data_type
is the same as the first
arg_data_type.
--- 141,164 ----
! If the state transition functions are declared strict,
! then it cannot be called with null inputs. With such transition
! functions, aggregate execution behaves as follows. Rows with any null input
values are ignored (the function is not called and the previous state value
is retained). If the initial state value is null, then at the first row
with all-nonnull input values, the first argument value replaces the state
value, and the transition function is invoked at subsequent rows with
! all-nonnull input values. Should inputs later need to be removed again, the
! inverse transition function (if present) is used as long as some non-null
! inputs remain part of the state value. In particular, even if the initial
! state value is null, the inverse transition function might be used to remove
! the first non-null input, even though that input was never passed to the
! forward transition function, but instead just replaced the initial state!
! The last non-null input, however, is not removed by invoking the inverse
! transition function, but instead the state is simply reset to its initial
! value. This is handy for implementing aggregates like max.
! Note that turning the first non-null input into the initial state is only
! possible when
state_data_type
is the same as the first
arg_data_type.
*************** CREATE AGGREGATE sfunc
! The name of the state transition function to be called for each
input row. For a normal N>-argument
aggregate function, the sfunc>
must take N>+1 arguments,
--- 316,322 ----
sfunc
! The name of the (forward) state transition function to be called for each
input row. For a normal N>-argument
aggregate function, the sfunc>
must take N>+1 arguments,
*************** SELECT col FROM tab ORDER BY col USING s
*** 281,287 ****
The function must return a value of type state_data_type. This function
takes the current state value and the current input data value(s),
! and returns the next state value.
--- 326,334 ----
The function must return a value of type state_data_type. This function
takes the current state value and the current input data value(s),
! and returns the next state value. Note that if an inverse
! transition function is present, the forward transition function must
! not return NULL>
*************** SELECT col FROM tab ORDER BY col USING s
*** 294,299 ****
--- 341,368 ----
+ invsfunc
+
+
+ The name of the inverse state transition function to be called for each
+ input row. For a normal N>-argument
+ aggregate function, the invsfunc>
+ must take N>+1 arguments,
+ the first being of type state_data_type and the rest
+ matching the declared input data type(s) of the aggregate.
+ The function must return a value of type state_data_type. These are the same
+ demands placed on the forward transition function, meaning that the
+ signatures of the two functions, including their
+ strictness, must be identical. The inverse transition
+ function may return NULL> to force the aggregation to be
+ restarted from scratch.
+
+
+
+
+ state_data_type
diff --git a/doc/src/sgml/xaggr.sgml b/doc/src/sgml/xaggr.sgml
index e77ef12..0c25b6f 100644
*** a/doc/src/sgml/xaggr.sgml
--- b/doc/src/sgml/xaggr.sgml
***************
*** 16,22 ****
as each successive input row is processed.
To define a new aggregate
function, one selects a data type for the state value,
! an initial value for the state, and a state transition
function. The state transition function takes the previous state
value and the aggregate's input value(s) for the current row, and
returns a new state value.
--- 16,22 ----
as each successive input row is processed.
To define a new aggregate
function, one selects a data type for the state value,
! an initial value for the state, and a forward state transition
function. The state transition function takes the previous state
value and the aggregate's input value(s) for the current row, and
returns a new state value.
***************
*** 24,30 ****
can also be specified, in case the desired result of the aggregate
is different from the data that needs to be kept in the running
state value. The final function takes the last state value
! and returns whatever is wanted as the aggregate result.
In principle, the transition and final functions are just ordinary
functions that could also be used outside the context of the
aggregate. (In practice, it's often helpful for performance reasons
--- 24,42 ----
can also be specified, in case the desired result of the aggregate
is different from the data that needs to be kept in the running
state value. The final function takes the last state value
! and returns whatever is wanted as the aggregate result.
! To enable efficient evaluation of an aggregate used as a window function
! with a sliding frame (i.e. a frame that starts relative to the current row),
! an aggregate can optionally provide an inverse state transition function.
! The inverse transition function takes the the current state and the
! aggregate's input value(s) for the earliest row passed
! to the forward transition function, and returns a state equivalent to what
! the current state had been had the forward transition function never been
! invoked for that earliest row, only for all rows that followed it. Thus,
! if an inverse transition function is provided, the rows that were part of
! the previous row's frame but not of the current row's frame can simply be
! removed from the state instead of having to redo the whole aggregation
! over the new frame.
In principle, the transition and final functions are just ordinary
functions that could also be used outside the context of the
aggregate. (In practice, it's often helpful for performance reasons
*************** CREATE AGGREGATE avg (float8)
*** 132,137 ****
--- 144,188 ----
+ When providing an inverse transition function, care should be taken to
+ ensure that it doesn't introduce unexpected user-visible differences
+ between results obtained by reaggregating all inputs vs. using the inverse
+ transition function. An example for an aggregate where adding an inverse
+ transition function seems easy at first, yet were doing so would violate
+ this requirement is sum> over float> or
+ double precision>. A naive declaration of
+ sum(float>) could be
+
+
+ CREATE AGGREGATE unsafe_sum (float8)
+ (
+ stype = float8,
+ sfunc = float8pl,
+ invsfunc = float8mi
+ );
+
+
+ This aggregate, howevery, can give wildly different results than it would
+ have without the inverse transition function. For example, consider
+
+
+ SELECT
+ unsafe_sum(x) OVER (ORDER BY n ROWS BETWEEN CURRENT ROW AND
+ 1 FOLLOWING)
+ FROM (VALUES
+ (1, 1.0e20::float8),
+ (2, 1.0::float8)
+ ) AS v (n,x)
+
+
+ which returns 0 as it's second result, yet the expected answer is 1. The
+ reason for this is the limited precision of floating point types - adding
+ 1 to 1e20 actually leaves the value unchanged, and so substracting 1e20
+ again yields 0, not 1. Note that this is a limitation of floating point
+ types in general and not a limitation of PostgreSQL>.
+
+
+
Aggregate functions can use polymorphic
state transition functions or final functions, so that the same functions
can be used to implement multiple aggregates.