Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2013-12-17 07:01:40
Message-ID: CAApHDvq6SOD15-jw5=PcQTe+HHOC5uQEd-4880irxrSeOKk9wQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 17, 2013 at 11:06 AM, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Tue, Dec 17, 2013 at 1:18 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Once again: this patch has no business changing any user-visible behavior.
>> That would include not changing the number of evaluations of volatile
>> functions. The planner is full of places where optimizations are disabled
>> for volatile subexpressions, and I don't see why this should be different.
>>
>>
> My point was meant to be more along the lines of that I thought it was
> already broken and it perhaps should be fixed or at the very least we could
> warn the users about it.
> I would imagine that most of those other places in the planner are to
> prevent extra evaluations of volatile functions? In this particular case
> we're already evaluating these multiple extra times when a tuple moves of
> the top of the frame. I would have thought that we should only evaluate the
> volatile function once per tuple. This is not what the current
> implementation does.
>
> I don't have an issue skipping this optimisation when the aggregate's
> expression contain any volatile functions. I just wanted to raise my
> concerns about the current behaviour, which I find a bit bizarre.
>
>
>
To improve on the example I used to try and get my point across:

These were all run on an unpatched copy of HEAD. Ignore the actual results
from sum() and look at what currval() is set to after each query.

create sequence seq;
select sum(nextval('seq')) over (order by n rows between current row and
unbounded following)
from generate_series(1,10) n(n);
select currval('seq');

drop sequence seq;
create sequence seq;
select sum(nextval('seq')) over (order by n)
from generate_series(1,10) n(n);
select currval('seq');

nextval() is executed 55 times with the first query and 10 times with the
2nd query. Of course this is because the current implementation requires
that when a tuple moves out of scope that the whole frame be re-aggregated.
I had thought that all of the places that disabled optimisations due to
there being volatile somewhere were to stop the number of executions being
undefined, this case seems undefined already, or at least I can't find
anywhere in the docs that says the expression will be executed this number
of times.

Once again, I'm not fighting to have inverse transitions uses when volatile
functions are involved, I'll happily disable that. I just wanted to raise
this to find out if it's intended or not and it seemed like a good thread
to do it on.

Regards

David Rowley

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-12-17 07:56:09 Re: Race condition in b-tree page deletion
Previous Message Amit Kapila 2013-12-17 06:17:16 Re: Heavily modified big table bloat even in auto vacuum is running