Re: Sorting Improvements for 8.4

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Michał Zaborowski <michal(dot)zaborowski(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sorting Improvements for 8.4
Date: 2007-12-20 23:20:48
Message-ID: 476AF8D0.80900@mark.mielke.cc
Lists: pgsql-hackers

Jeff Davis wrote:
> On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote:
>
>> That sounds possible, but I still feel myself suspecting that disk
>> reads will be much slower than localized text comparison. Perhaps I am
>> overestimating the performance of the comparison function?
>>
> I think this simple test will change your perceptions:
>
Yes - I got the same results (although my PostgreSQL doesn't have a
built-in cast ::text::bytea... :-) )

> On my machine this table fits easily in memory (so there aren't any disk
> reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary
> data, and 20 seconds for localized text. That's much longer than it
> would take to read that data from disk, since it's only 70MB (which
> takes a fraction of a second on my machine).
>
Might this mean that PostgreSQL performs too many copy operations? :-)

> I think this disproves your hypothesis that sorting happens at disk
> speed.
>
Yes.
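
For readers without a PostgreSQL instance at hand, the kind of in-memory
comparison Jeff describes can be approximated outside the database. The
following is only a rough sketch, not Jeff's test: the function names,
data sizes, and random data are all assumptions made for illustration,
and Python's per-call overhead around strcoll() will exaggerate the
cost of the localized case relative to what a C implementation sees.

```python
import functools
import locale
import random
import string
import time


def time_sort(values, key=None):
    """Return the seconds taken to sort a copy of `values`."""
    start = time.perf_counter()
    sorted(values, key=key)
    return time.perf_counter() - start


def make_words(n, length=12, seed=0):
    """Build n random ASCII words (hypothetical test data)."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_letters, k=length))
            for _ in range(n)]


def compare_sort_costs(n=200_000):
    """Time in-memory sorts of floats, raw bytes, and localized text."""
    # Use the environment's collation; fall back to "C" if it is unusable.
    try:
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        locale.setlocale(locale.LC_COLLATE, "C")

    words = make_words(n)
    rng = random.Random(1)
    floats = [rng.random() for _ in range(n)]
    raw = [w.encode("utf-8") for w in words]

    return {
        "float": time_sort(floats),
        "bytes": time_sort(raw),
        # strcoll() is called once per comparison, as a locale-aware
        # comparator would be.
        "localized text": time_sort(
            words, key=functools.cmp_to_key(locale.strcoll)),
    }


if __name__ == "__main__":
    for label, seconds in compare_sort_costs().items():
        print(f"{label:15s} {seconds:.3f}s")
```

Nothing here touches the disk, so any difference between the three
timings comes from the comparison function and the surrounding
bookkeeping, which is the point under discussion.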

>> Yep - I started to read up on it. It still sounds like it's a hard-ish
>> problem (to achieve near N times speedup for N CPU cores without
>> degrading performance for existing loads), but that doesn't mean
>> impossible. :-)
>>
> You don't even need multiple cores to achieve a speedup, according to
> Ron's reference.
>
I think Ron's reference actually said that you don't need full cores to
achieve a speedup. It spoke of Intel's HT (Hyper-Threading) system. A
single CPU with a single execution pipeline is not going to function
better with multiple threads unless the single-thread case is written
wrong. Without hardware support, multiple threads are always an overall
loss. The usual argument for them is that multiple threads can sometimes
lead to cleaner designs, which are sometimes more naturally written to
perform well. In my experience, the opposite is usually true.

But, if you do have HT, and the algorithm can be modified to take
advantage of it for an overall increase in speed - great.
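
One common shape for such a modification is to sort independent chunks
concurrently and then merge the sorted runs. The sketch below is an
assumption about what that could look like in general, not a description
of PostgreSQL's sort code or of the approach in Ron's reference; the
function name chunked_sort is hypothetical. In CPython, threads will not
speed up a CPU-bound sort because of the global interpreter lock, so the
thread pool here only demonstrates the structure.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def chunked_sort(values, n_chunks, executor):
    """Sort `values` by sorting n_chunks slices concurrently, then merging."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Ceiling division so that at most n_chunks slices are produced.
    size = max(1, -(-len(values) // n_chunks))
    chunks = [values[i:i + size] for i in range(0, len(values), size)]

    # Each chunk is sorted independently; this is the parallelizable part.
    sorted_runs = list(executor.map(sorted, chunks))

    # The final k-way merge is sequential.
    return list(heapq.merge(*sorted_runs))


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(chunked_sort([5, 3, 1, 4, 2], 2, pool))
```

The sequential merge at the end, plus the cost of splitting and handing
off work, is the overhead that has to be recovered before the concurrent
version comes out ahead of a single-threaded sort.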

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>
