Re: Sort performance on large tables

From: Charlie Savage <cfis(at)interserv(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Sort performance on large tables
Date: 2005-11-08 22:06:04
Message-ID: dkr7ga$18a4$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks everyone for the feedback.

I tried increasing work_mem:

set work_mem to 300000;

select tlid, min(ogc_fid)
from completechain
group by tld;

The results are:

"GroupAggregate (cost=9041602.80..10003036.88 rows=48071704 width=8)
(actual time=4371749.523..5106162.256 rows=47599910 loops=1)"
" -> Sort (cost=9041602.80..9161782.06 rows=48071704 width=8) (actual
time=4371690.894..4758660.433 rows=48199165 loops=1)"
" Sort Key: tlid"
" -> Seq Scan on completechain (cost=0.00..2228584.04
rows=48071704 width=8) (actual time=49.518..805234.970 rows=48199165
loops=1)"
"Total runtime: 5279988.127 ms"

Thus the time decreased from 8486 seconds to 5279 seconds - which is a
nice improvement. However, that still leaves postgresql about 9 times
slower.

I tried increasing work_mem up to 500000, but at that point the machine
started using its swap partition and performance degraded back to the
original values.

Charlie

Richard Huxton wrote:
> Charlie Savage wrote:
>> Hi everyone,
>>
>> I have a question about the performance of sort.
>
>> Note it takes over 10 times longer to do the sort than the full
>> sequential scan.
>>
>> Should I expect results like this? I realize that the computer is
>> quite low-end and is very IO bound for this query, but I'm still
>> surprised that the sort operation takes so long.
>
> The sort will be spilling to disk, which will grind your I/O to a halt.
>
>> work_mem = 16384 # in Kb
>
> Try upping this. You should be able to issue "set work_mem = 100000"
> before running your query IIRC. That should let PG do its sorting in
> larger chunks.
>
> Also, if your most common access pattern is ordered via tlid look into
> clustering the table on that.

Richard Huxton wrote:
> Charlie Savage wrote:
>> Hi everyone,
>>
>> I have a question about the performance of sort.
>
>> Note it takes over 10 times longer to do the sort than the full
>> sequential scan.
>>
>> Should I expect results like this? I realize that the computer is
>> quite low-end and is very IO bound for this query, but I'm still
>> surprised that the sort operation takes so long.
>
> The sort will be spilling to disk, which will grind your I/O to a halt.
>
>> work_mem = 16384 # in Kb
>
> Try upping this. You should be able to issue "set work_mem = 100000"
> before running your query IIRC. That should let PG do its sorting in
> larger chunks.
>
> Also, if your most common access pattern is ordered via tlid look into
> clustering the table on that.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-11-08 22:26:21 Re: Sort performance on large tables
Previous Message Stephan Szabo 2005-11-08 20:48:45 Re: Sort performance on large tables