Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-03-03 23:00:43
Message-ID: 64F96FD9-64D1-40B9-8861-E6182029220B@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mar2, 2014, at 20:39 , Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> * In a couple of places:
>
> errmsg("stricness of forward and reverse transition functions must match")
>
> - misspelling: "stricness".
> - "reverse" should be "inverse" to match the terminology used elsewhere.

Done.

> * Grammatical error in the comment for lookup_agg_function() - you
> should drop the word "both".

Done.

> * In show_windowagg_info(), this calculation looks suspicious to me:
>
> double tperrow = winaggstate->aggfwdtrans /
> (inst->nloops * inst->ntuples);
>
> If the node is executed multiple times, aggfwdtrans will be reset in
> each loop, so the transitions per row figure will be under-estimated.
> ISTM that if you want to report on this, you'd need aggfwdtrans to be
> reset once per query, but I'm not sure exactly how to do that.
>
> ...
>
> Actually, I think it's misleading to only count forward transition
> function calls, because a call to the inverse transition function
> still represents a state transition, and is likely to be around the
> same cost. For a window of size 2, there would not be much advantage
> to using inverse transition functions, because it would be around 2
> transitions per row either way.

True. In fact, I pondered whether to avoid using the inverse transition
function for windows of 2 rows. In the end, I didn't because I felt that
it makes custom aggregates harder to test.

On the question of whether to count inverse transition function calls -
the idea of the EXPLAIN VERBOSE ANALYZE output isn't really to show the
number of state transitions, but rather to show whether the aggregation
has O(n) or O(n^2) behaviour. The idea being that a value close to "1"
means "inverse transition function works as expected", and larger values
mean "not working so well".

Regarding multiple evaluations - I think I based the behaviour on how
ntuples works, which also only reports the value of the last evaluation
I think. But maybe I'm confused about this.

> * The function comment for build_aggregate_fnexprs() needs to be
> updated to reference the inverse transition function. I'd also be
> tempted to have it allow invtransfnexpr be a NULL pointer, if the
> inverse transition function expression tree is not required. Then
> ExecInitAgg() could simply pass NULL, instead of having the local
> variable invtransfnexpr with the slightly cryptic comment "needed but
> never used".

Hm, I suggested that to David Rowley initially, and he wasn't particularly
convinced - after all, there aren't *that* many callers of that function,
and there isn't much overhead - we just set the value to NULL if an invalid
OID is passed. But now that you brought it up too, it's two yea's agains
one nay, so I changed it.

> * In struct WindowStatePerAggData, I think you should change the field
> order to transfn_oid, invtransfn_oid and then finalfn_oid. It's only a
> small thing, but that's the order those 3 functions are referred to
> everywhere else.

Done.

> * In struct WindowStatePerAggData, the comment for transValueCount
> should read "number of aggregated values".

Done.

> * If AggCheckCallContext() is called from a window function, and it
> asks for an aggcontext, it will fail because calledaggno will be -1.
> That can't currently happen for any of our built-in window functions,
> and I'm not sure if it's likely to happen in the future, but I think
> it would be better to defend against that possibility just in case. So
> I think it ought to return the shared context in that case, as the
> original code would have done.

I did it this way on purpose, because it was never actually safe to
use the shared content from window functions - they won't know if and
when that context is reset. I added a note to the function comment
explaining that this function isn't meant to be used by true window
functions

> * In advance_windowaggregate(), this code
>
> if (peraggstate->transfn.fn_strict) {
>
> is against the project style, which is to have curly braces on new
> lines. But also, that test condition is the same as the preceding
> block, so the 2 blocks could just be merged.

Done. The two conditions used to be different, I think, but requiring
the strictness settings to match made them the same.

> * I was wondering about the case of a forward transition function
> returning NULL in the presence of an inverse transition function. In
> this patch there are 3 pieces of code that test for that:
>
> 1). advance_windowaggregate() errors out if the forward transition
> function returns NULL and there is an inverse transition function.
>
> 2). an Assert in advance_windowaggregate() fires if a prior call made
> the state NULL and there is an inverse transition function (should be
> impossible due to the above error).

Yeah, that assert is just there to document that "yes, this cannot be".

> 3). retreat_windowaggregate() errors out if it sees a NULL state
> (which ought to be impossible due to both of the above).

This too is simply meant to document "No worries, cannot be".

> I find the resulting error "transition function with an inverse
> returned NULL" surprising. Why shouldn't a transition function return
> NULL if it wants to? It can if it's used as a regular aggregate, so it
> seems somewhat odd that it can't if it's used in a window context, and
> it has an inverse.

Because transition function cannot use NULL as a state value freely if
they have an inverse, since if the inverse returns NULL that means
"sorry, cannot invert, please restart aggregation".

> Would it not be simpler and more flexible to just allow the forward
> transition function to return NULL, and treat a NULL state as
> non-invertible, requiring a restart. So delete check (1) and just
> allow the forward transition function to return NULL, delete Assert
> (2) so that it propagates a NULL state to the end as it would do in
> the absence of an inverse transition function, and modify check (3) to
> return false forcing a restart if the state is NULL. So then if the
> forward transition function did return NULL, the inverse transition
> function would not actually be called, and it would compute the answer
> the hard way, rather than erroring out.

Though that would be weird too - if the inverse transition function
is non-strict, we actually can *call* it with a NULL state, it just
cannot really return a NULL state (well it can, but that means
"restart the aggregation, please").

I don't think there's anything to be gained by allowing this, so I'd
prefer if we don't.

> * In retreat_windowaggregate(), the comment before the check on
> transValueCount is missing a "be".

Done.

> * I think the function comment for eval_windowaggregates() should be
> updated to mention that it may also use an inverse transition function
> to remove aggregated data from a transition value.

Done.

> * I found the guts of eval_windowaggregates() a little hard to follow,
> although I think the code is correct. It could perhaps use a little
> tidying up. Here are a few ideas:
>
> - Maybe numaggs_restart would be better called numtorestart.

Hm, I'm not particularly fond of that - it doesn't really emphasize the
connection to numaggs.

> - is_first isn't adding much, and it would probably be easier to read
> if it were eliminated by inlining it in the one place it is used.

Done.

>
> - Similarly I would move the variable "ok" to the block that uses it.

Done.

>
> - The variable pos could be eliminated by having the retreat loop
> increment winstate->aggregatedbase in each pass, which would better
> match its comment which says it updates aggregatedbase, which it
> currently doesn't do. For consistency, perhaps this loop should be
> written more in the style of the advance loop. The loop should ensure
> that, on exit, aggregatedbase is equal to frameheadpos.

Done, mostly. The loops still look a bit different, but that's mostly
because of the tuple-slot optimization in the advance loop (it re-uses
the last row it fetched during the *previous* call to eval_windowaggregates
if possible), and because the retreat loop already knows how far to
retreat, while the advance loop figures that out as it goes.

> - aggregatedupto_nonrestarted is a bit of a mouthful, but I can't
> immediately think of a better name. However, there's one comment that
> refers to it as aggregatedupto_previous.

I don't have a better name either, so I kept the current name.

> - The comment starting "After this, non-restarted aggregated..." is a
> bit confusing. What the code following it really seems to be doing is
> more along the lines of "If there are aggregates to restart, rewind
> aggregatedupto back to frameheadpos so that we can re-aggregate those
> values in the aggregates to be restarted...".

I re-wrote that comment, and also moved the block that updates the
tuplestore mark up above the restart-block. Thus, the block that updates
aggregatedupto is now immediately before the advance loop, which hopefully
makes the connection more obvious.

> * The pg_dump modifications don't look correct to me. I didn't test
> it, but I think you need to modify the pre-9.4 queries to return "-"
> for agginvtransfn otherwise I think it will be NULL and will seg-fault
> if you dump a pre-9.4 database containing custom aggregates.

Hm, I haven't touched this code, it comes from David's original patch.
I agree that it looks wrong - I've updated it per your suggestion, and
checked that it dumps 9.3 and HEAD+patch correctly.

> * The last regression test in aggregates.sql should test for unequal
> strictness values, rather than just strict forward transition
> functions with non-strict inverses.

Done. That was a leftover from when only one case was rejected.

Attached a new versions of all 5 patches. Only the base patch has actually
changed, the others are just included for completeness' sake.

best regards,
Florian Pflug

Attachment Content-Type Size
invtrans_strictstrict_arith_76031b.patch application/octet-stream 66.9 KB
invtrans_strictstrict_base_038070.patch application/octet-stream 105.2 KB
invtrans_strictstrict_collecting_1a1ca1d.patch application/octet-stream 32.8 KB
invtrans_strictstrict_docs_0cb944.patch application/octet-stream 22.1 KB
invtrans_strictstrict_minmax_237683.patch application/octet-stream 70.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kouhei Kaigai 2014-03-03 23:25:40 Re: Custom Scan APIs (Re: Custom Plan node)
Previous Message Andres Freund 2014-03-03 22:43:25 Re: Changeset Extraction v7.9