Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-23 02:27:03
Message-ID: 3C63AF01-7D8C-4243-8942-78AE84B8F3B4@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jan23, 2014, at 01:07 , David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Tue, Jan 21, 2014 at 3:20 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> On Jan20, 2014, at 08:42 , David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> >> On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> >> * I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change, and
>> >> it's the last commit, so if you object to that, then you can merge up to
>> >> eafa72330f23f7c970019156fcc26b18dd55be27 instead of
>> >> de3d9148be9732c4870b76af96c309eaf1d613d7.
>> >
>> >
>> > Seems like sfunc really should be tfunc then we could have invtfunc. I'd probably
>> > understand this better if I knew what the 's' was for in sfunc. I've not applied
>> > this just yet. Do you have a reason why you think it's better?
>>
>> My issue with just "invfunc" is mainly that it's too generic - it doesn't tell
>> you what it's supposed to be the inverse of.
>>
>> I've always assumed that 's' in 'sfunc' and 'stype' stands for 'state', and that
>> the naming is inspired by control theory, where the function which acts on the
>> state space is often called S.
>
> Ok, that makes more sense now and it seems like a reasonable idea. I'm not not quite
> sure yet as when someone said upthread that these "negative transition functions" as
> I was calling them at the time should really be called "inverse transition functions",
> I then posted that I was going to call the create aggregate option "invfunc" which
> nobody seemed to object to. I just don't want to go and change that now. It is very
> possible this will come up again when the committer is looking at the patch. It would
> be a waste if it ended up back at invfunc after we changed it to invsfunc.

Since we already settled on "inverse transition function", I kinda doubt that
calling the parameter invsfunc is going to meet a lot of resistance. But we can put
that off a little longer still...

I've pushed a few additional things to https://github.com/fgp/postgres/tree/invtrans.

* I update the CREATE AGGREGATE documentation, trying to include the description of
the various modes of inverse transition functions into the paragraphs which already
explained about STRICT for transition functions and such.

* I've also updated the list of window functions to include a list of those
aggregates which potentially need to restart the computation, i.e. MIN/MAX and
the like.

* I've changed nodeWindowAgg.c to use per-aggregate aggregation contexts for the
invertible aggregates. Without that, the aggregate context is potentially never
reset, because that previously required *all* the aggregates to restart at the
same time. That would be OK if we were sure not to leak unbounded amounts of
stuff stores in that context, but unfortunately we sometimes do. For example,
whenever a strict, invertible aggregate ends up with only NULL inputs, we
re-initialize the aggregation, which leaks the old state value. We could
pfree() that of course, but that state value might reference other stuff that
we don't know about and thus cannot free. Separating the aggregation contexts
is the only solution I came up with, so I did that.

* I've also tweaked an if to flag aggregates as invertible only if the frame head
can actually move, i.e. if the frame start clause is something other than
UNBOUNDED PRECEEDING. Since the choice of whether to use a private aggregation
context is driven by that flag, that also makes the above apply only to aggregates
were the inverse transition function is actually used.

I hope to find some time tomorrow or so to complete my pass through the documentation -
what's still missing as an explanation of the EXPLAIN VERBOSE ANALYZE field and maybe
some cleanup of xaggr.sgml.

Do you have any additional things pending?

best regards,
Florian Pflug

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2014-01-23 03:00:06 Re: Add min and max execute statement time in pg_stat_statement
Previous Message Florian Pflug 2014-01-23 02:09:48 Confusing documentation of ordered-set aggregates?