From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Greg S <stark(at)mit(dot)edu> |
Subject: | Re: Using quicksort for every external sort run |
Date: | 2016-01-29 17:46:47 |
Message-ID: | CAM3SWZR14_4kswgoWLZTkQsAtTCNo4VbjYcSAhbE0MKNOX1m1w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 29, 2016 at 9:24 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I feel like this could be data driven. I mean, the cost model is
> based mainly on the tuple width and the size of the SortTuple array.
> So, it should be possible to tests of both algorithms on 32, 64, 96,
> 128, ... byte tuples with a SortTuple array that is 256MB, 512MB,
> 768MB, 1GB, ... Then we can judge how closely the cost model comes to
> mimicking the actual behavior.
You would also need to represent how much of the input actually ended
up being sorted with the heap in each case. Maybe that could be tested
at 50% (bad for "quicksort with spillover"), 25% (better), and 5%
(good).
An alternative approach that might be acceptable is to add a generic,
conservative 90% threshold (so 10% of tuples sorted by heap).
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Alexander Korotkov | 2016-01-29 20:55:55 | Re: Sequence Access Method WIP |
Previous Message | Thom Brown | 2016-01-29 17:43:02 | Re: [WIP] Effective storage of duplicates in B-tree index. |