Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions

From: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Date: 2013-11-19 03:55:14
Message-ID: CADp-Sm4704TL3jv3+hq5k4+xza544y3NsdEtrdTMt9NSFCb7rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 14, 2013 at 6:51 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello,
>
> > When I read it again and try to relate, I get your point. Actually true,
> > hashes must always be performed as last option (if that is what you too
> > meant) and if there are few other operations they must be the last one to
> > be performed especially after sorting/grouping. Hashes must somehow make
> > use of already sorted data (I think this something even you indicated)
>
> Yes, some 'hash'es could preserve order selecting such a function
> for hash function. But at least PostgreSQL's 'HashAggregation'
> uses not-order-preserving function as hash function. So output
> cannot preserve input ordering.
>
> > I will do that if I get a DB2 system or Oracle system running. I will try
> > to replicate the same 2 test cases and share the plan. One thing which I
> am
> > sure is, the below part of the plan
> >
> > QUERY PLAN | Subquery Scan on __unnamed_subquery_0
> > (cost=12971.39..16964.99 rows=614 width=43) (actual
> > time=2606.075..2953.937 rows=558 loops=1)
> >
> > would be generated as RID scan in DB2 (which I have seen to perform
> better
> > than normal subquery scans in DB2).
>
> DB2's document says it is used for 'index ORing' corresponds OR
> or IN ops, which seems to be a relative to BitmapOr of
> PostgreSQL, perhaps not to HashAggregates/SemiJoin.
>
> I tried to imagin the plan for the group_by case with repeated
> index scan and merging..
>
> > select student_name
> > from student_score
> > where (course,score) in (select course,max(score)
> > from student_score group by course);
>
> Taking the advantage that the cardinarity of course is 8, this
> query could be transformed into 8 times of index scan and
> bitmaping.
>
> With hypothetical plan node LOOP, and BitmapScanAdd the plan
> could be,
>
> | BitmapHeapScan (rows = 155, loops = 1)
> | -> LOOP
> | ON Subquery (select distinct course from student_course) as c0
> | -> BitmapScanAdd (loops = 8)
> | BitmapCond: (student_score.score = x)
> | -> Limit (rows = 1, loops = 8) AS x
> | -> Unique (rows = 1, loops = 8)
> | -> IndexScan using idx_score on student_course (rows = 1,
> loops = 8)
> | Filter (student_course.course = c0)
>
> I suppose this is one possibility of what DB2 is doing. If DB2
> does the same optimization for ranking > 1 with the dense_rank()
> case, this plan might be like this,
>
> I can not be sure but this one seems logically correct from cost and
cardinality perspective(am not sure the operations that the DB2 planner
would generate ). Need to test it.

> | BitmapHeapScan (rows = 133, loops = 1)
> | -> LOOP
> | ON Subquery (select distinct course from student_course) as c0
> | -> BitmapScanAdd (loops = 8)
> | BitmapCond: (student_score.score = x)
> | -> Limit (rows = 1, loops = 8) AS x
> | -> Unique (rows = 2, loops = 8)
> | -> IndexScan using idx_score on student_course (rows =
> 18,loops= 8)
> | Filter (student_course.course = c0)
>
> Both plans surely seem to be done shortly for relatively small
> n's and number of courses.
>
> I guess they would do well even when the cardinality of courses is fairly
high (unless we hit a scenario where courses offered are more than/in the
same decimal range as students opting for them).

On the other hand, using semijoin as PostgreSQL does, creating
> HashAggregate storing nth place score for every course requires
> some memory to work on for each course.
>
> | Hash Semi Join
> | Hash Cond: (a.course = b.course and a.score = b.score)
> | -> SeqScan on student_score as a
> | -> Hash
> | -> HashAggregatesFunc (rows = 8)
> | Key = course, func = rankn(dense_rank(), n, key, val)
> | -> SeqScan on student_score (rows = 122880)
>
> Where, rankn() must keep socres down to nth rank and emits nth
> score as final value. I don't get more generic form for this
> mechanism right now, and the value to do in this specific manner
> seems not so much..
>
>
> I feel the advantage could be more when dealing with a DW environment
which has more complex aggregate and windowing queries. Extending this to
other windowing function, it could be a great gain for DW and OLAP queries.

Regards
Sameer

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Wood 2013-11-19 04:24:01 lock on object is already held
Previous Message David Johnston 2013-11-19 03:40:42 Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block