Re: Sorting Improvements for 8.4

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>, Michał Zaborowski <michal(dot)zaborowski(at)gmail(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Ron Mayer" <rm_pg(at)cheapcomplexdevices(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sorting Improvements for 8.4
Date: 2007-12-20 01:26:13
Message-ID: 87wsra9mx6.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


"Jeff Davis" <pgsql(at)j-davis(dot)com> writes:

> test=> explain analyze select * from sorter order by t;
> test=> explain analyze select * from sorter order by b;
> test=> explain analyze select * from sorter order by f;
>
> On my machine this table fits easily in memory (so there aren't any disk
> reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary
> data, and 20 seconds for localized text. That's much longer than it
> would take to read that data from disk, since it's only 70MB (which
> takes a fraction of a second on my machine).
>
> I think this disproves your hypothesis that sorting happens at disk
> speed.

I suspect most of that is spent just copying the data around. Which would not
be helped by having multiple threads doing the copying -- and in fact might be
exacerbated if it required an extra copy to consolidate all the data in the
end.

How long does a "explain analyze sinmple select * from sorter" take?

And assuming you're doing disk sorts (in disk cache) you're doing quite a lot
of copying to temporary files (in disk cache) and then back to memory.

Note that speeding up a query from 20s to 5s isn't terribly useful. If it's
OLTP you can't be using all your cores for each user anyways. And if it's DSS
20s isn't a problem.

Where parallel processing like this becomes attractive is when you're running
a 2 hour query on a machine sequentially running scheduled batch jobs which
can be sped up to 30 minutes. But in that case you're almost certainly being
limited by your disk bandwidth, not your cpu speed.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dann Corbit 2007-12-20 01:42:02 Re: Sorting Improvements for 8.4
Previous Message Tom Lane 2007-12-20 01:02:30 Re: Sorting Improvements for 8.4