Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-17 02:05:30
Message-ID: 5E529086-3F86-4FD4-82A6-3775AD99C72B@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I had some more fun with this, the result is v2.5 of the patch (attached). Changes are explained below.

On Jan16, 2014, at 19:10 , Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Jan16, 2014, at 09:07 , David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> On Wed, Jan 15, 2014 at 5:39 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>>> The notnullcount machinery seems to apply to both STRICT and non-STRICT transfer function pairs. Shouldn't that be constrained to STRICT transfer function pairs? For non-STRICT pairs, it's up to the transfer functions to deal with NULL inputs however they please, no?
>>
>> The reason I had to track the notnullcount was because for non-strict was because:
>>
>> select sum(n) over (order by i rows between current row and unbounded following) from (values(1,1),(2,NULL)) v(i,n);
>>
>> would otherwise return
>> 1
>> 0
>>
>> sum is not strict, so I guess we need to track notnullcount for non-strict functions.
>> See the code around if (peraggstate->notnullcount == 0) in retreat_windowaggregate().
>
> Yeah, I figured that was the reason, but you can't fix it that way. See http://www.postgresql.org/message-id/8E857D95-CBA4-4974-A238-9DD7F61BEA48@phlo.org for a detailed explanation why. My 2.2 patch allows pairs of non-strict forward and strict inverse transfer functions exactly because of this - i.e., basically it decrees that if there's an inverse transfer function, the strict setting of the *inverse* function determines the aggregates NULL behaviour. The forward transfer function is then never called
> for NULL inputs, but it *is* called with the NULL state for the first non-NULL input, and *must* then return a non-NULL state (hence it's technically not strict, it's strict only in the inputs, not in the state).
>
> BTW, I just realized I failed to update CREATE AGGREGATE's logic when I did that in 2.2. That doesn't matter for the built-in aggregates since these aren't created with CREATE AGGREGATE, but it's obviously wrong. I'll post a fixed patched.

Fixed now.

>>> The logic around movedaggbase in eval_windowaggregates() seems a bit convoluted. Couldn't the if be moved before the code that pulls aggregatedbase up to frameheadpos using the inverse aggregation function?
>
>> I had a look at this and even tried moving the code to before the inverse transitions, but it looks like that would only work if I added more tests to the if condition to ensure that we're not about to perform inverse transitions. To me it just seemed more bullet proof the way it is rather than duplicating logic and having to ensure that it stays duplicated. But saying that I don't think I've fully got my head around why the original code is valid in the first place. I would have imagined that it should contain a check for FRAMEOPTION_START_UNBOUNDED_FOLLOWING, but if that sounds like complete rubbish then I'll put it down to my head still being fried from work.
>
> Ok, I get this now. That code really just is asking "is the previous row's frame the same as the current row's frame" in that "if" where you added movedagg. What confused me was the rather weird way that condition is expressed, which turns out is due to the fact that at the point of the if, we don't know the new frame's end. Now, we could use update_frametailpos() to find that, but that's potentially costly, especially if the tuple store was spilled to disk. So instead, the code relies on the fact that only if the frame end is "n FOLLOWING/PRECEDING" can the current row lie within the previous row's frame without the two frame's ends being necessarily the same.
>
> For added confusion, that "if" never explicitly checks whether the frame's *heads* coincide - the previous code seems to have relied on the impossibility of having "aggregatedbase <= current < aggregatedupto" after re-initializing the aggregation, because then aggregatedbase = aggregatedupto = 0. That's why you can't just move the "if" before the "frameheadpos == aggregatedbase" check. But you can *if* you also check whether "aggregatedbase == frameheadpos" in the if - which is clearer than relying on that rather subtle assumption anyway.
>
> BTW, the your patch will also fail badly if the frame head ever moves backwards - the invariant that "aggregatedbase == frameheadpos" after the inverse transition loop will then be violated. Now, this should never happen, because we require that the offset in "n PRECEDING/FOLLOWING" is constant, but we should probably still check for this and elog().
>
> That check was implicit in old code, because it advanced the tuplestore mark, so if the frame head moved backwards, re-scanning the frame would have failed. That brings me to another todo - as it stands, that mark gets never advanced if we're doing inverse aggregation. To fix that, we need a call to WinSetMarkPosition() somewhere in eval_windowaggregates().
>
> After doing this things, eval_windowaggregates ended up calling initalize_windowframeaggregates at a single place again, so I inlined it back into eval_windowaggregates(). I also made it use temp_slot_1 in the inverse aggregation loop, since that seemed safe - the code assumes some invariants about aggregatedbase and agg_row_slow.
>
> Updated patch is attached (2.4_fgp).

I've now shuffled things around so that we can use inverse transition functions
even if only some aggregates provide them, and to allow inverse transition
functions to force a restart by returning NULL. The rules regarding NULL handling
are now the following

(A) Transition functions without an inverse behave as they always did
(B) Forward transition functions with an inverse may not return NULL,
not even for all-NULL inputs.
(C) If the inverse transition function returns NULL, the aggregation is
restarted from the beginning, just as If no inverse transition
function had been provided.
(D) If the transition function is strict, the inverse must also be.
(E) NULLs are skipped if the inverse transition function is strict,
regardless of whether the forward transition function is.

The rules may seem a bit weird, but they're the only solution I found which

* Doesn't require every invertible transition function with "internal"
state to implement it's own NULL counting. This requires (E), since
transition functions with state type "internal" need to be callable
with a NULL state - since nobody else can allocate the state object.

* Still allows transition functions to handle NULL values however they
please - they just have to have a non-strict inverse to opt out of (E).
They cannot use NULL states however they please, though.

* Allows partial inverse transition functions, i.e. inverse transition
functions which sometimes report "sorry, please restart". The only
easy way to report this is to have the inverse transition function
return NULL, which means we cannot allow NULL as just another state
value, i.e. this mandates (B,C)

(D) is purely to avoid having to deal with any special-cases that might
arise from allowing this.

I've also moved the volatile functions check into initialize_peragg() -
having it in ExecWindowAgg() made no sense to me since it's dealing
with static information. Also, ExecWindowAgg() may get called multiple
times, e.g. if the window functions are on the inner side of a
nested loop join or within a subselect.

>>> Also, could we easily check whether the frames corresponding to the individual rows are all either identical or disjoint, and don't use the inverse transfer function then? Currently, for a frame which contains either just the current row, or all the current row's peers, I think we'd use the inverse transfer function to fully un-add the old frame, and then add back the new frame.
>>
>> I didn't know there was a situation where this could happen. Could you give me an example query and I'll run it through the debugger to have a look at what's going on. But sure, if this is possible and I understand what you mean then I guess it would be a good optimisation to detect this and throw away the previous results and start fresh.
>
> The case I had in mind there was "RANGE BETWEEN CURRENT ROW AND CURRENT ROW", i.e. each row's frame consists of all it's ordering peers. Haven't looked into this yet.

It now resets the aggregation when the new frame doesn't overlap the old one, i.e. if frameheadpos > aggregatedupto.

The patch passes regression test, but I haven't done any other testing yet.
Also, the documentation doesn't explain the NULL and STRICT semantics yet.

best regards,
Florian Pflug

Attachment Content-Type Size
inverse_transition_functions_v2.5_fgp.patch.gz application/x-gzip 22.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2014-01-17 02:08:59 Re: new json funcs
Previous Message Robert Haas 2014-01-17 01:48:24 Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance