Re: help tuning queries on large database

From: peter royal <peter(dot)royal(at)pobox(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: help tuning queries on large database
Date: 2006-01-09 17:23:15
Message-ID: 414814B8-556C-47DB-B364-A5A681CC3522@pobox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Jan 8, 2006, at 1:42 PM, Luke Lonergan wrote:
> Have you tested the underlying filesystem for it's performance?
> Run this:
> time bash -c 'dd if=/dev/zero of=/my_file_system/bigfile bs=8k
> count=<your_memory_size_in_GB * 250000> && sync'

This is a 2-disk RAID0

[root(at)bigboy /opt/alt-2]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/
bigfile bs=8k count=1000000 && sync'
1000000+0 records in
1000000+0 records out

real 1m27.143s
user 0m0.276s
sys 0m37.338s

'iostat -x' showed writes peaking at ~100MB/s

> Then run this:
> time dd if=/my_file_system/bigfile bs=8k of=/dev/null

[root(at)bigboy /opt/alt-2]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/
null
1000000+0 records in
1000000+0 records out

real 1m9.846s
user 0m0.189s
sys 0m11.099s

'iostat -x' showed reads peaking at ~116MB/s

Again with kernel 2.6.15:

[root(at)bigboy ~]# time bash -c 'dd if=/dev/zero of=/opt/alt-2/bigfile
bs=8k count=1000000 && sync'
1000000+0 records in
1000000+0 records out

real 1m29.144s
user 0m0.204s
sys 0m48.415s

[root(at)bigboy ~]# time dd if=/opt/alt-2/bigfile bs=8k of=/dev/null
1000000+0 records in
1000000+0 records out

real 1m9.701s
user 0m0.168s
sys 0m11.933s

> And report the times here please. With your 8 disks in any of the
> RAID0
> configurations you describe, you should be getting 480MB/s. In the
> RAID10
> configuration you should get 240.

Not anywhere near that. I'm scouring the 'net looking to see what
needs to be tuned at the HW level.

> You should also experiment with using larger readahead, which you can
> implement like this:
> blockdev --setra 16384 /dev/<my_block_device>
>
> E.g. "blockdev --setra 16384 /dev/sda"

wow, this helped nicely. Without using the updated kernel, it took
28% off my testcase time.

> From what you describe, one of these is likely:
> - hardware isn't configured properly or a driver problem.

Using the latest Areca driver, looking to see if there is some
configuration that was missed.

> - you need to use xfs and tune your Linux readahead

Will try XFS soon, concentrating on the 'dd' speed issue first.

On Jan 8, 2006, at 4:35 PM, Ron wrote:
>> Areca ARC-1220 8-port PCI-E controller
>
> Make sure you have 1GB or 2GB of cache. Get the battery backup and
> set the cache for write back rather than write through.

The card we've got doesn't have a SODIMM socket, since its only an 8-
port card. My understanding was that was cache used when writing?

> A 2.6.12 or later based Linux distro should have NO problems using
> more than 4GB or RAM.

Upgraded the kernel to 2.6.15, then we were able to set the BIOS
option for the 'Memory Hole' to 'Software' and it saw all 4G (under
2.6.11 we got a kernel panic with that set)

>> RAID Layout:
>>
>> 4 2-disk RAID0 sets created
> You do know that a RAID 0 set provides _worse_ data protection than
> a single HD? Don't use RAID 0 for any data you want kept reliably.

yup, aware of that. was planning on RAID10 for production, but just
broke it out into RAID0 sets for testing (from what I read, I
gathered that the read performance of RAID0 and RAID10 were comparable)

thanks for all the suggestions, I'll report back as I continue testing.

-pete

--
(peter.royal|osi)@pobox.com - http://fotap.org/~osi

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-01-09 18:41:05 Re: 500x speed-down: Wrong query plan?
Previous Message Bernard Dhooghe 2006-01-09 17:10:02 >= forces row compare and not index elements compare when possible