Re: PoC: Partial sort

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: PoC: Partial sort
Date: 2014-08-19 10:02:57
Message-ID: CAApHDvraq8KirPj96LQZKC5jxsN=9o2J-VV-L6ZoKyt1F0OzQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 11, 2014 at 7:59 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:
>
> Done. Patch is splitted.
>

I've started to look at this, and for now I'm still finding my way around
the patch, so I'm not quite there yet with understanding everything.
Never-the-less it seems best to post my comments early, so as to help
maintain concurrency between the review and getting the patch into shape.

I've only been looking at partial-sort-basic-1.patch so far;

The patch no longer applies to master, but this was only due to a tab being
replaced by 2 spaces in a pgident run. I've attached an updated patch which
currently applies without any issues.

Here's a few notes from reading over the code:

* pathkeys.c

EquivalenceMember *member = (EquivalenceMember *)
lfirst(list_head(key->pk_eclass->ec_members));

You can use linitial() instead of lfirst(list_head()). The same thing
occurs in costsize.c

* pathkeys.c

The following fragment:

n = pathkeys_common(root->query_pathkeys, pathkeys);

if (n != 0)
{
/* It's useful ... or at least the first N keys are */
return n;
}

return 0; /* path ordering not useful */
}

Could just read:

/* return the number of path keys in common, or 0 if there are none */
return pathkeys_common(root->query_pathkeys, pathkeys);

* execnodes.h

In struct SortState, some new fields don't have a comment.

I've also thrown a few different workloads at the patch and I'm very
impressed with most of the results. Especially when LIMIT is used, however
I've found a regression case which I thought I should highlight, but for
now I can't quite see what could be done to fix it.

create table a (x int not null, y int not null);
insert into a select x.x,y.y from generate_series(1,1000000) x(x) cross
join generate_series(1,10) y(y);

Patched:
explain analyze select x,y from a where x+0=1 order by x,y limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=92.42..163.21 rows=10 width=8) (actual
time=6239.426..6239.429 rows=10 loops=1)
-> Partial sort (cost=92.42..354064.37 rows=50000 width=8) (actual
time=6239.406..6239.407 rows=10 loops=1)
Sort Key: x, y
Presorted Key: x
Sort Method: quicksort Memory: 25kB
-> Index Scan using a_x_idx on a (cost=0.44..353939.13
rows=50000 width=8) (actual time=0.059..6239.319 rows=10 loops=1)
Filter: ((x + 0) = 1)
Rows Removed by Filter: 9999990
Planning time: 0.212 ms
Execution time: 6239.505 ms
(10 rows)

Time: 6241.220 ms

Unpatched:
explain analyze select x,y from a where x+0=1 order by x,y limit 10;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Limit (cost=195328.26..195328.28 rows=10 width=8) (actual
time=3077.759..3077.761 rows=10 loops=1)
-> Sort (cost=195328.26..195453.26 rows=50000 width=8) (actual
time=3077.757..3077.757 rows=10 loops=1)
Sort Key: x, y
Sort Method: quicksort Memory: 25kB
-> Seq Scan on a (cost=0.00..194247.77 rows=50000 width=8)
(actual time=0.018..3077.705 rows=10 loops=1)
Filter: ((x + 0) = 1)
Rows Removed by Filter: 9999990
Planning time: 0.510 ms
Execution time: 3077.837 ms
(9 rows)

Time: 3080.201 ms

As you can see, the patched version performs an index scan in order to get
the partially sorted results, but it does end up quite a bit slower than
the seqscan/sort that the unpatched master performs. I'm not quite sure how
realistic the x+0 = 1 WHERE clause is, but perhaps the same would happen if
something like x+y = 1 was performed too.... After a bit more analysis on
this, I see that if I change the 50k estimate to 10 in the debugger that
the num_groups is properly estimated at 1 and it then performs the seq scan
instead. So it looks like the costings of the patch are not to blame here.
(The 50k row estimate comes from rel tuples / DEFAULT_NUM_DISTINCT)

That's all I have at the moment... More to follow soon.

Regards

David Rowley

Attachment Content-Type Size
partial-sort-basic-1_rebased.patch application/octet-stream 55.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2014-08-19 10:18:47 Re: Reporting the commit LSN at commit time
Previous Message Rahila Syed 2014-08-19 09:37:03 Re: [REVIEW] Re: Compression of full-page-writes