Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-02 12:59:09
Message-ID: CAP-rdTbNjqndHJe7Vh-ZFJ+Zi1dpRQUYOm0YeRoSzVFTXQ9MgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2013/12/15 David Rowley <dgrowleyml(at)gmail(dot)com>:

> I've been working on speeding up aggregate functions when used in the
> context of a window's with non fixed frame heads.

> 1. Fully implement negative transition functions for SUM and AVG.

I would like to mention that this functionality is also extremely
useful to have for the incremental maintenance of materialized views
that use aggregation (which IMHO is one of the most useful kinds).

(Simply imagine a view of the form “SELECT a, agg_function(b) FROM
table GROUP BY a”, a lot of rows in the table, a lot of rows in each
group, and changes that both remove and add new rows.)

For this to work, two things are needed:

(1) A way to apply a value normally (already supported) and inversely
(i.e., this patch) to the current “internal state” of an aggregation.

(2) A way to store the “internal state” of an aggregation in the
materialized view’s “extent” (i.e., the physical rows that represent
the view’s contents, which may or may not be slightly different from
what you get when you do SELECT * FROM matview). As (AFAIK) that state
is stored as a normal value, the maintenance code could just take the
value, store it in the extent, and next time retrieve it again and
perform normal or inverse transitions. When selecting from the
matview, the state could be retrieved, and the final function applied
to it to yield the value to be returned.

To understand (2), assume that one wants to store an AVG() in a
materialized view; To be able to update the value incrementally, one
needs to actually store the SUM() and COUNT(), and perform the
division when selecting from the materialized view. Or it could
(initially) be decided to define AVG() as “not supporting fast
incremental maintenance,” and require the user (if he/she wants fast
incremental maintenance) to put SUM() and COUNT() in the materialized
view manually, and perform the division manually when wanting to
retrieve the average.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-02 12:59:18 Re: preserving forensic information when we freeze
Previous Message Robert Haas 2014-01-02 12:53:45 Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE