Re: possible optimization: push down aggregates

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: possible optimization: push down aggregates
Date: 2014-08-27 19:48:34
Message-ID: CAFj8pRDiRxnipzAukjhUprmDsY5tW38H9yeXjdX4AMx60Kf+iA@mail.gmail.com
Lists: pgsql-hackers

2014-08-27 21:46 GMT+02:00 Claudio Freire <klaussfreire(at)gmail(dot)com>:

> On Wed, Aug 27, 2014 at 4:41 PM, Merlin Moncure <mmoncure(at)gmail(dot)com>
> wrote:
> > On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> >> Hi
> >>
> >> one user asked about using partitioning for faster aggregate queries.
> >>
> >> I found that there is no such optimization.
> >>
> >> create table x1(a int, d date);
> >> create table x_1 ( check(d = '2014-01-01'::date)) inherits(x1);
> >> create table x_2 ( check(d = '2014-01-02'::date)) inherits(x1);
> >> create table x_3 ( check(d = '2014-01-03'::date)) inherits(x1);
> >>
> >> With this schema, the optimizer produces:
> >>
> >> postgres=# explain verbose select max(a) from x1 group by d order by d;
> >>                                   QUERY PLAN
> >> --------------------------------------------------------------------------------
> >>  GroupAggregate  (cost=684.79..750.99 rows=200 width=8)
> >>    Output: max(x1.a), x1.d
> >>    Group Key: x1.d
> >>    ->  Sort  (cost=684.79..706.19 rows=8561 width=8)
> >>          Output: x1.d, x1.a
> >>          Sort Key: x1.d
> >>          ->  Append  (cost=0.00..125.60 rows=8561 width=8)
> >>                ->  Seq Scan on public.x1  (cost=0.00..0.00 rows=1 width=8)
> >>                      Output: x1.d, x1.a
> >>                ->  Seq Scan on public.x_1  (cost=0.00..31.40 rows=2140 width=8)
> >>                      Output: x_1.d, x_1.a
> >>                ->  Seq Scan on public.x_2  (cost=0.00..31.40 rows=2140 width=8)
> >>                      Output: x_2.d, x_2.a
> >>                ->  Seq Scan on public.x_3  (cost=0.00..31.40 rows=2140 width=8)
> >>                      Output: x_3.d, x_3.a
> >>                ->  Seq Scan on public.x_4  (cost=0.00..31.40 rows=2140 width=8)
> >>                      Output: x_4.d, x_4.a
> >> Planning time: 0.333 ms
> >>
> >> It can be reduced to:
> >>
> >> Sort by d
> >>   Append
> >>     Aggregate (a), d
> >>       Seq Scan on x_1
> >>     Aggregate (a), d
> >>       Seq Scan on x_2
> >>
> >> Are there any plans to use partitioning for aggregation?
> >
> > Besides min/max, what other aggregates (mean/stddev come to mind)
> > would you optimize and how would you determine which ones could be?
> > Where is that decision made?
>
>
> You can't with mean and stddev, only with associative aggregates.
>
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
>
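A minimal Python sketch (not from the thread, illustrative data only) of why the aggregates Claudio lists can be pushed below the Append: they can be computed per partition and then combined, whereas naively combining per-partition means gives a wrong answer when partition sizes differ.

```python
# Hypothetical partitions of one logical table
partitions = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
all_rows = [x for p in partitions for x in p]

# Associative aggregates: combining per-partition results
# matches aggregating the whole table.
assert min(min(p) for p in partitions) == min(all_rows)
assert max(max(p) for p in partitions) == max(all_rows)
assert sum(sum(p) for p in partitions) == sum(all_rows)
assert sum(len(p) for p in partitions) == len(all_rows)  # count

# A mean of per-partition means is NOT correct in general
# when partitions have different row counts:
uneven = [[1.0], [2.0, 3.0]]
flat = [x for p in uneven for x in p]
naive = sum(sum(p) / len(p) for p in uneven) / len(uneven)
true_mean = sum(flat) / len(flat)
assert naive != true_mean  # 1.75 vs 2.0
```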

I don't think so. If I have partitions by X and my query has a GROUP BY X
clause, then every group falls entirely within a single partition, so I can
calculate any aggregate.
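A short Python sketch (illustrative data, not from the thread) of Pavel's point: when the grouping key equals the partition key, each group lives inside exactly one partition, so even a non-decomposable aggregate like population standard deviation can be computed per partition with no cross-partition combine step.

```python
import statistics
from collections import defaultdict

# Hypothetical (d, a) rows, partitioned by d
part_1 = [("2014-01-01", 10), ("2014-01-01", 20)]
part_2 = [("2014-01-02", 5), ("2014-01-02", 15), ("2014-01-02", 25)]

# Grouping key == partition key: aggregate each partition independently.
per_partition = {
    p[0][0]: statistics.pstdev(a for _, a in p) for p in (part_1, part_2)
}

# Same result as grouping the union of all rows by d.
groups = defaultdict(list)
for d, a in part_1 + part_2:
    groups[d].append(a)
global_result = {d: statistics.pstdev(v) for d, v in groups.items()}

assert per_partition == global_result
```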

Pavel
