Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-20 20:52:24
Message-ID: 162867790912201252q77ee2588me4c4bf76b5919f1c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I am thinking about implementation of median function. This function
should be implemented in two ways:

a) direct entering an ORDER BY clause for median funcall in gram.y
b) general support for "preordered aggregates".

I prefer plan b, because there are more similar aggregates - like
Quantiles. So with general support of preordered aggregates we can
move these aggregates to contrib, separate library or core (if we
would). I am for median in core - and others in contrib - as example
of this kind of aggregates.

What we need:

a) some new intelligence in parser - it use default ORDER BY clause
when explicit ORDER BY clause is missing.

b) for effective implementation we need to know real number of tuples
in state function. Without this knowledge we have to copy tuples to
tuple store or to array. It is useless, because we have to execute
sortby before - so we have this information. There could be one
optimalisation. We don't need to process all rows - for median we have
to process only 1/2 of rows - it could be nice, if state function can
send signal - don't call me more.

New syntax:

CREATE AGGREGATE name ( input_data_type [ORDER BY [(DESC|ASC)]] [ , ... ] ) (
SFUNC = sfunc,
STYPE = state_data_type
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
)

example:

CREATE AGGREGATE median (float8 ORDER BY) (
SFUNC = median_state,
STYPE = internal,
FINALFUNC = median_final,
SORTOP = '<'
)

I would to insure, so this idea is acceptable.

Regards
Pavel Stehule


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-20 21:08:37
Message-ID: 4876.1261343317@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> I am thinking about implementation of median function. This function
> should be implemented in two ways:

> a) direct entering an ORDER BY clause for median funcall in gram.y
> b) general support for "preordered aggregates".

> I prefer plan b, because there are more similar aggregates - like
> Quantiles.

This seems like a great deal of mechanism to solve a very localized
problem.

I think that we've already expanded the capabilities of aggregates
a great deal for 8.5, and we should let it sit as-is for a release
or two and see what the real user demand is for additional features.

I'm particularly concerned by the fact that the feature set is already
far out in front of what the planner can optimize effectively (e.g.,
there's no ability to combine the work when multiple aggregates need the
same sorted data). The more features we add on speculation, the harder
it's going to be to close that gap.

Another risk is that features added now might preclude adding others
later.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-20 21:21:01
Message-ID: 162867790912201321v50da73c4ta169aedd62970b7f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> I am thinking about implementation of median function. This function
>> should be implemented in two ways:
>
>> a) direct entering an ORDER BY clause for median funcall in gram.y
>> b) general support for "preordered aggregates".
>
>> I prefer plan b, because there are more similar aggregates - like
>> Quantiles.
>
> This seems like a great deal of mechanism to solve a very localized
> problem.
>

plan a doesn't block plan b - it is very simple. So we can start with a.

> I think that we've already expanded the capabilities of aggregates
> a great deal for 8.5, and we should let it sit as-is for a release
> or two and see what the real user demand is for additional features.
>
> I'm particularly concerned by the fact that the feature set is already
> far out in front of what the planner can optimize effectively (e.g.,
> there's no ability to combine the work when multiple aggregates need the
> same sorted data).  The more features we add on speculation, the harder
> it's going to be to close that gap.

I didn't thing about this optimalisation, but this point could not be
impossible. Bigger problem is using of indexes.

>
> Another risk is that features added now might preclude adding others
> later.
>

sure. It was my question, what is preferred.

Regards
Pavel

>                        regards, tom lane
>


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-20 21:23:05
Message-ID: 52AFDB8E-124E-46A0-9B62-987C789077E9@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi from a real user :)

Le 20 déc. 2009 à 22:08, Tom Lane a écrit :
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> b) general support for "preordered aggregates".
>
> I think that we've already expanded the capabilities of aggregates
> a great deal for 8.5, and we should let it sit as-is for a release
> or two and see what the real user demand is for additional features.

All we can have in PostgreSQL without needing to resort to either PLs or application code is worth it from here, and I can already picture the smiling on our developers face when I say them median() is there by default.

> I'm particularly concerned by the fact that the feature set is already
> far out in front of what the planner can optimize effectively (e.g.,
> there's no ability to combine the work when multiple aggregates need the
> same sorted data). The more features we add on speculation, the harder
> it's going to be to close that gap.
>
> Another risk is that features added now might preclude adding others
> later.

Now, I have no idea if augmenting the aggregate properties with an optional sorting step is the right approach, but it sounds right on spot (general enough without being over engineering). I guess it would give the planner the same information as if the user did type the extra order by himself, so I'm not sure how much your remarks would apply?

I mean we already have explicit user ordering in aggregates at call site, adding the exact same information in the aggregate definition itself surely isn't going to be such a change there?

Regards,
--
dim


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-20 21:39:52
Message-ID: 5358.1261345192@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine <dfontaine(at)hi-media(dot)com> writes:
> Le 20 dc. 2009 22:08, Tom Lane a crit :
>> Another risk is that features added now might preclude adding others
>> later.

> Now, I have no idea if augmenting the aggregate properties with an optional sorting step is the right approach, but it sounds right on spot (general enough without being over engineering). I guess it would give the planner the same information as if the user did type the extra order by himself, so I'm not sure how much your remarks would apply?

> I mean we already have explicit user ordering in aggregates at call site, adding the exact same information in the aggregate definition itself surely isn't going to be such a change there?

It's not obvious to me that this feature is better than others that
might apply in the same general area, eg the existing min/max
optimization. Since (to repeat myself) we have no field experience
with ordered aggregate inputs, assuming we know what the next extension
should be in this area seems a bit premature.

Also, you're conveniently ignoring my point about how optimization
should probably be the next area of focus here. Right now, there
is really little if any value in having this feature, because a
median aggregate could sort internally and not be any more or less
efficient than having nodeAgg do it. Once we have an optimization
framework there might be some advantage there, but we should not
complicate matters with second-order features before we have that.

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pavel(dot)stehule(at)gmail(dot)com (Pavel Stehule), Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-20 22:48:33
Message-ID: 87zl5de1vc.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Pavel" == Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:

> 2009/12/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> I think that we've already expanded the capabilities of aggregates
>> a great deal for 8.5, and we should let it sit as-is for a release
>> or two and see what the real user demand is for additional
>> features.
>>
>> I'm particularly concerned by the fact that the feature set is
>> already far out in front of what the planner can optimize
>> effectively (e.g., there's no ability to combine the work when
>> multiple aggregates need the same sorted data).  The more features
>> we add on speculation, the harder it's going to be to close that
>> gap.

I absolutely agree with Tom here and for some quite specific reasons.

An optimal (or at least more optimal than is currently possible)
implementation of median() on top of the ordered-agg code as it stands
requires additions to the aggregate function interface: the median agg
implementation would have to, as a minimum, know how many rows of
sorted input are available. In addition, it would be desirable for it
to have direct (and possibly bidirectional) access to the tuplesort.

Now, if we look at how ordered aggs ought to be optimized, it's clear
that the planner should take the ordering costs into account and
consider plans that order the input instead. Once you do this, then
there's no longer any pre-computed count or tuplesort object
available, so if you'd implemented a better median() before the
optimizations, you'd end up either having to forgo the optimization or
break the median() agg again; clearly not something we want.

Plus, a feature that I intentionally omitted from the ordered-aggs
patch is the ability to do ordered-aggs as window functions. This is
in the spec, but (a) there were conflicting patches for window
functions on the table and (b) in my opinion, much of the work needed
to implement ordered-agg-as-window-func in an effective manner is
dependent on doing more work on optimization first (or at least will
potentially become easier as a result of that work).

So I think both the optimization issue and the window functions issue
would be best addressed before trying to build any additional features
on top of what we have so far.

--
Andrew (irc:RhodiumToad)


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 01:37:51
Message-ID: 407d949e0912201737m4c503878y8db53fbdbe317e6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Incidentally there are O(n) algorithms for finding the median (or any
specific offset). It shouldn't be necessary to sort at all.

I'm not sure which path this argues for - perhaps Tom's position that
we need more optimiser infrastructure so we can see how to accomplish
this. Perhaps it means you should really implement median() with an
internal selection alagorithm and not depend on the optimizer at all.

--
greg


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 05:01:35
Message-ID: 162867790912202101v411fb4b9if27a1bc7ab3447fa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/21 Greg Stark <gsstark(at)mit(dot)edu>:
> Incidentally there are O(n) algorithms for finding the median (or any
> specific offset). It shouldn't be necessary to sort at all.

it is interesting information. It could to help with missing
optimalisations now.

Pavel

>
> I'm not sure which path this argues for - perhaps Tom's position that
> we need more optimiser infrastructure so we can see how to accomplish
> this. Perhaps it means you should really implement median() with an
> internal selection alagorithm and not depend on the optimizer at all.
>
>
> --
> greg
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 05:48:15
Message-ID: 162867790912202148j3381f9a2k52a3b697ab2201d8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/20 Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>:
>>>>>> "Pavel" == Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>
>  > 2009/12/20 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>  >> I think that we've already expanded the capabilities of aggregates
>  >> a great deal for 8.5, and we should let it sit as-is for a release
>  >> or two and see what the real user demand is for additional
>  >> features.
>  >>
>  >> I'm particularly concerned by the fact that the feature set is
>  >> already far out in front of what the planner can optimize
>  >> effectively (e.g., there's no ability to combine the work when
>  >> multiple aggregates need the same sorted data).  The more features
>  >> we add on speculation, the harder it's going to be to close that
>  >> gap.
>
> I absolutely agree with Tom here and for some quite specific reasons.
>
> An optimal (or at least more optimal than is currently possible)
> implementation of median() on top of the ordered-agg code as it stands
> requires additions to the aggregate function interface: the median agg
> implementation would have to, as a minimum, know how many rows of
> sorted input are available. In addition, it would be desirable for it
> to have direct (and possibly bidirectional) access to the tuplesort.

I was thinking about some new kind of aggregates based on tuple-store.

>
> Now, if we look at how ordered aggs ought to be optimized, it's clear
> that the planner should take the ordering costs into account and
> consider plans that order the input instead. Once you do this, then
> there's no longer any pre-computed count or tuplesort object
> available, so if you'd implemented a better median() before the
> optimizations, you'd end up either having to forgo the optimization or
> break the median() agg again; clearly not something we want.
>

Information about count of rows are not detected in planner time. This
have to by available in any executor's sort node. So this value will
be available every time - index scan is problem. Interesting is Greg's
info about O(N) algorithms. Then we can implement median as classic
aggregate.

> Plus, a feature that I intentionally omitted from the ordered-aggs
> patch is the ability to do ordered-aggs as window functions. This is
> in the spec, but (a) there were conflicting patches for window
> functions on the table and (b) in my opinion, much of the work needed
> to implement ordered-agg-as-window-func in an effective manner is
> dependent on doing more work on optimization first (or at least will
> potentially become easier as a result of that work).
>

I fully agree, so agg(x order by x) needs some work more - so general
design is premature.

On second hand - any internal implementation of median will be faster,
then currently used techniques.

Pavel

> So I think both the optimization issue and the window functions issue
> would be best addressed before trying to build any additional features
> on top of what we have so far.
>
> --
> Andrew (irc:RhodiumToad)
>


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 10:13:02
Message-ID: 407d949e0912210213u39b857b0qa842885d715079e6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 21, 2009 at 5:48 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Information about count of rows are not detected in planner time. This
>  have to by available in any executor's sort node. So this value will
> be available every time - index scan is problem. Interesting is Greg's
> info about O(N) algorithms. Then we can implement median as classic
> aggregate.
>...
> On second hand - any internal implementation of median will be faster,
> then currently used techniques.

Some further information now that I'm on a real keyboard.

The O(n) algorithm for median of which I'm aware is Quickselect. It's
based on Quicksort which makes it unsuitable for a disk-based
algorithm since it would have to read every block many times. Perhaps
there's some way to adapt it or there might be some other O(n)
algorithm which has better i/o patterns.

But in cases where the tupleset/tuplesort is still in memory it would
be easy to add support for pulling out the nth element and use that in
a magic median() function. It wouldn't be a "classic" aggregate, at
least as I understand it where you also need something like O(1) space
to store the state, because you definitely need access to the whole
tupleset to do the quickselect.

If we don't find a way to optimize this for on-disk tuplestores though
then I wonder whether it's worth it. It would only help in the narrow
case that you had a large enough set to see the difference between
doing quickselect and quicksort but small enough to fit in workmem. I
suppose that case is widening over time though as memory sizes get
bigger and bigger.

--
greg


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 11:43:24
Message-ID: 162867790912210343r9ce2d53s51616392c3003042@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/12/21 Greg Stark <gsstark(at)mit(dot)edu>:
> On Mon, Dec 21, 2009 at 5:48 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> Information about count of rows are not detected in planner time. This
>>  have to by available in any executor's sort node. So this value will
>> be available every time - index scan is problem. Interesting is Greg's
>> info about O(N) algorithms. Then we can implement median as classic
>> aggregate.
>>...
>> On second hand - any internal implementation of median will be faster,
>> then currently used techniques.
>
> Some further information now that I'm on a real keyboard.
>
> The O(n) algorithm for median of which I'm aware is Quickselect. It's
> based on Quicksort which makes it unsuitable for a disk-based
> algorithm since it would have to read every block many times. Perhaps
> there's some way to adapt it or there might be some other O(n)
> algorithm which has better i/o patterns.
>
> But in cases where the tupleset/tuplesort is still in memory it would
> be easy to add support for pulling out the nth element and use that in
> a magic median() function. It wouldn't be a "classic" aggregate, at
> least as I understand it where you also need something like O(1) space
> to store the state, because you definitely need access to the whole
> tupleset to do the quickselect.
>

I can check speed diferences on orafce's median implementation. But
orafce isn't correct - it ignores work_mem limit - so it usable only
for some external modules or for testing. I'll try simple median
implementation based on aggregate API. Then I'll compare speed.

> If we don't find a way to optimize this for on-disk tuplestores though
> then I wonder whether it's worth it. It would only help in the narrow
> case that you had a large enough set to see the difference between
> doing quickselect and quicksort but small enough to fit in workmem. I
> suppose that case is widening over time though as memory sizes get
> bigger and bigger.

Thank you

I have to do same test and get more info about quickselect

Pavel

>
> --
> greg
>


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Pre ordered aggregates, default ORDER BY clause for aggregates - median support
Date: 2009-12-21 13:35:57
Message-ID: 407d949e0912210535l298b68ecu50a94d1a873b86e4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 21, 2009 at 11:43 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> If we don't find a way to optimize this for on-disk tuplestores though
>> then I wonder whether it's worth it. It would only help in the narrow
>> case that you had a large enough set to see the difference between
>> doing quickselect and quicksort but small enough to fit in workmem. I
>> suppose that case is widening over time though as memory sizes get
>> bigger and bigger.
>
> Thank you
>
> I have to do same test and get more info about quickselect

So my first thought of how to do median efficiently on large on-disk
data structures is to do a first pass, attempting to find as narrow a
target partition as possible that the median definitely falls in. Then
do a second pass counting how many tuples fall before and after the
target partition and accumulating all the tuples in the range and
perform quickselect to find the right tuple in the target partition. I
do wonder

To find a target partition you could use quickselect itself to find a
set of medians but I think it will boil down to a kind of huffman-like
tree encoding. You want to accumulate values as degenerate singleton
partitions until they don't fit in work_mem. When you exhaust work_mem
you collapse two adjacent "partitions" into a single partition
covering the whole range with a count of 2. You keep going collapsing
the two lowest count ranges each time (perhaps preferring to collapse
ranges far from the current median?). When you're done you can easily
determine which range contains the median. You then have to scan the
whole original input again skipping any tuple that falls outside that
partition. If it still doesn't fit in work_mem you have to repeat the
algorithm.

This looks like a lot of code for a narrow use case though. And it
would do a minimum of two passes through the input which is going to
mean it'll perform similarly to our regular tape sort until the inputs
get very large and tapesort needs to do multiple passes. At that point
it's possible this algorithm might need multiple passes as well,
though hopefully of smaller and smaller sets. I thought I would put it
down in an email in case anyone's interested in experimenting though.

Also, I assume I've just reinvented some standard algorithm I should
remember from school, though I can't tell which offhand. It looks kind
of like mergesort except since we're just doing selection we don't
need to actually do the merge step, just keep track of which values we
would have merged.

--
greg