Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-01-20 07:42:54
Message-ID: CAApHDvoKQfd+e8VQRsx0_3ZW==HV0KNfjWXVvwaeCZ76zzS6Ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 20, 2014 at 2:45 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:

> On Jan19, 2014, at 20:00 , David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > I've applied that patch again and put in the sort operators.
>
> I've push a new version to https://github.com/fgp/postgres/tree/invtrans
> which includes
>
> * A bunch of missing declaration for *_inv functions
>
>
Thanks, I've applied that.

> * An assert that the frame end doesn't move backwards - I realized that
> it is after all easy to do that, if it's done after the loop which adds
> the new values, not before.
>
>
I've applied this too, but I'm wondering why an elog for if the head moves
back, but an assert if the tail moves back?

> * EXPLAIN VERBOSE ANALYZE now shows the max. number of forward aggregate
> transitions per row and aggregate. It's a bit imprecise, because it
> doesn't
> track the count per aggregate, but it's still a good metric for how well
> the inverse transition functions work. If the number is close to one, you
> know that very few rescans are happening.
>
>
I've not looked at this yet and I don't think I'll have time tonight, but
it sounds interesting. I guess it might be quite nice to have a way to see
this especially with the way the numeric stuff works, it might be actually
pretty hard to otherwise know how many inverse transition "failures" there
had been. Do you think it's also worth tracking the inverse transition
failures too?

* I've also renamed INVFUNC to INVSFUNC. That's a pretty invasive change,
> and
> it's the last commit, so if you object to that, then you can merge up to
> eafa72330f23f7c970019156fcc26b18dd55be27 instead of
> de3d9148be9732c4870b76af96c309eaf1d613d7.
>
>
Seems like sfunc really should be tfunc then we could have invtfunc. I'd
probably understand this better if I knew what the 's' was for in sfunc.
I've not applied this just yet. Do you have a reason why you think it's
better?

> A few more things I noticed, all minor stuff
>
> * do_numeric_discard()'s inverseTransValid flag is unnecessary. First, if
> the
> inverse transition function returns NULL once, we never call it again,
> so the
> flag won't have any practical effect. And second, assume we ever called
> the
> forward transition function after the inverse fail, and then retried the
> inverse.
> In the case of do_numeric_discard(), that actually *could* allow the
> inverse
> to suddenly succeed - if the call to the forward function increased the
> dscale
> beyond that of the element we tried to remove, removal would suddenly be
> possible again. We never do that, of course, and it seems unlikely we
> ever
> will. But it's still weird to have code which serves no other purpose
> than to
> pessimize a case which would otherwise just work fine.
>
>
hmm, yeah of course, you are right. I've removed this now.

> * The state == NULL checks in all the strict inverse transition functions
> are
> redundant.
>
>
ok, I've removed these and added comments to note that these functions
should be declared strict.

> I haven't taken a close look at the documentation yet, I hope to be able to
> do that tomorrow. Otherwise, things look good as far as I'm concerned.
>
>
Thanks, yeah those really do need a review. I've lost a bit of direction
with them and I'm not quite sure just how much detail to go in to with it.
I'd like to explain a bit that users who need to use their numeric columns
in window aggregates might want to think about having a defined scale to
the numeric rather than an undefined scale and explain that this is because
the inverse transition function for numeric bails out if it loses the
maximum seen dscale. Though it does seem generally a good idea to have a
defined scale, but then I guess you've got to have a bit of knowledge about
the numbers you're storing in that case. I'm not quite sure how to put that
into words friendly enough for the documents just yet and or exactly where
to put the words. So any ideas or patches you have around that would be
great.

Once again thanks for all your work on this.

Regards

David Rowley

> best regards,
> Florian Pflug
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sameer Kumar 2014-01-20 07:46:48 using rpmbuild with PostgreSQL 9.2.6 source code
Previous Message Amit Khandekar 2014-01-20 07:11:07 Re: Proposal: variant of regclass