Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions

From: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Date: 2013-10-24 04:01:57
Message-ID: CADp-Sm6-pVk9P7tLrme=CYmBjXzYNzu+WNbSA1V_NbTrMFWoKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Agree that windowing function will return all the rows compared to max and
group by returing only max rows per group. But even while arriving at the
aggregate/sorting windowing function seems to spend more effort than group
by/order by.

I am just trying to see if we could somehow optimize the way windowing
operations are performed. (May be in query rewrite). Datawarehouses could
use that improvement.
Its not my production box, so I can live with disk sort. I have tried with
huge sorting memory but still I see a similar difference in cost of sorting
for grouping/ordering Vs windowing function.
Another thing regarding work_memory, I have generally seen that windowing
functions expect more amount of memory for sorting compared to
grouping/ordering clauses.
On 24 Oct 2013 10:54, "David Johnston" <polobo(at)yahoo(dot)com> wrote:

> Sameer Kumar wrote
> > edb=# explain analyze select max(score) from student_score group by
> > course;
>
> This query returns 6 records. The window one returns 123,000. Why do you
> expect these to have anywhere near the same performance or plan?
>
> You can enable/disable indexes/scans to see what alternatives plans may
> provide but nothing here stands out as being obviously incorrect.
>
> I'm not really clear on what your question is. Generally it sounds as if
> you are wondering if there are any plans to I prove the algorithms behind
> window function processing. Are you just looking at symptoms and thus
> possibly have unreasonable expectations or do you actually see an avenue
> for
> improvement in the engine?
>
>
> > QUERY PLAN | Sort Method: external merge Disk: 7576kB
>
> Work memory; I/O is killing your performance on this query. It is more
> flexible but you pay a price for that.
>
>
> > Another thing, (I may be stupid and naive here) does PostgreSQL re-uses
> > the
> > hash which has been already created for sort. In this case the inner
> query
> > must have created a hash for windoing aggregate. Can't we use that same
> > one
> > while applying the the filter "rn=1" ?
>
> Probably but others more knowledgable will need to answer authoritatively.
>
> David J.
>
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Using-indexes-for-ORDER-BY-and-PARTITION-BY-clause-in-windowing-functions-tp5775605p5775708.html
> Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2013-10-24 04:05:29 Re: high-dimensional knn-GIST tests (was Re: Cube extension kNN support)
Previous Message Alvaro Herrera 2013-10-24 03:52:16 Re: Add min and max execute statement time in pg_stat_statement