Re: Sort performance

Lists: pgsql-hackers
From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Sort performance
Date: 2006-09-01 14:16:01
Message-ID: 87psefisf2.fsf@stark.enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I'm not sure if this is good news or bad news. Either some kudos are due to
the gang that worked on the external sort performance or something's very
wrong with the qsort implementation in glibc because I'm seeing Postgres's
external sort perform better than qsort.

This is despite Postgres external sorts having to execute filesystem calls
pushing buffers back and forth between user-space and kernel-space, which
seems hard to believe. I feel like something's got to be pretty far wrong with
the qsort call here for this to be possible.

At first I chalked this up to qsort having O(n^2) behaviour occasionally but
a) This is glibc where qsort is actually mergesort which should behave pretty
similarly to Postgres's mergesort and b) the input data is randomized pretty
well so it really ought be a problem even were it qsort.

Mem Runs Time
---- ---- ----
1MB 18 8.25s
10MB 3 5.6s
100MB qsort 6.1s

The input is a table with one column, a text field. It contains
/usr/share/dict/words ordered by random() and then repeated a bunch of times.
(Sorry about the imprecision, I set this table up a while ago and don't
remember exactly what I did). a

The machine has plenty of RAM and isn't swapping or running any other
services.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sort performance
Date: 2006-09-01 15:57:16
Message-ID: 5444.1157126236@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> Mem Runs Time
> ---- ---- ----
> 1MB 18 8.25s
> 10MB 3 5.6s
> 100MB qsort 6.1s

I'm confused what this means exactly? Are you saying that in the first
two cases, there were 18 and 3 sorted runs generated in the initial
pass, and in the third case we just did the sort in memory using qsort?

How many items are being sorted, exactly? Since it's text, it probably
also makes a big difference what LC_COLLATE setting you are using.
Non-C sort locale could mean that the strcoll() calls swamp all else.

How long does it take sort(1) to do the same task?

regards, tom lane