Re: Using IOZone to simulate DB access patterns

Lists: pgsql-performance
From: henk de wit <henk53602(at)hotmail(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Cc: <josh(at)agliodbs(dot)com>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-04 10:00:52
Message-ID: COL104-W563A53910118E7079DBE5F5860@phx.gbl


> I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone.

Perhaps a little off-topic here, but I'm assuming you are using Linux to test your DB server (since you mention Bonnie++). But it seems to me that IOZone only has a win32 client. How did you actually run IOZone on Linux?


From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: henk de wit <henk53602(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, josh(at)agliodbs(dot)com
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-04 10:49:52
Message-ID: 49D73B50.1010007@krogh.cc

henk de wit wrote:
>> I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone.
>
> Perhaps a little off-topic here, but I'm assuming you are using Linux to
> test your DB server (since you mention Bonnie++). But it seems to me
> that IOZone only has a win32 client. How did you actually run IOZone on
> Linux?

$ apt-cache search iozone
iozone3 - Filesystem and Disk Benchmarking Tool

--
Jesper


From: henk de wit <henk53602(at)hotmail(dot)com>
To: <jesper(at)krogh(dot)cc>
Cc: <pgsql-performance(at)postgresql(dot)org>, <josh(at)agliodbs(dot)com>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-04 15:54:43
Message-ID: COL104-W79AC894F57255090CE4EB1F5860@phx.gbl

> $ apt-cache search iozone
> iozone3 - Filesystem and Disk Benchmarking Tool

You are right. I was confused with IOMeter, which can't be run on Linux (the Dynamo part can, but that's not really useful without the 'command & control' part).


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: henk de wit <henk53602(at)hotmail(dot)com>
Cc: jesper(at)krogh(dot)cc, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 05:41:47
Message-ID: 49DEDC1B.7080303@agliodbs.com

All,

Wow, am I really the only person here who's used IOZone?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: henk de wit <henk53602(at)hotmail(dot)com>, jesper(at)krogh(dot)cc, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 06:26:58
Message-ID: 49DEE6B2.7030602@paradise.net.nz

Josh Berkus wrote:
> All,
>
> Wow, am I really the only person here who's used IOZone?
>

No - I used to use it exclusively, but everyone else tended to demand I
redo stuff with bonnie before taking any finding seriously... so I've
kinda 'submitted to the Borg' as it were....


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: henk de wit <henk53602(at)hotmail(dot)com>, jesper(at)krogh(dot)cc, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 17:10:03
Message-ID: 49DF7D6B.9020107@agliodbs.com

On 4/9/09 11:26 PM, Mark Kirkwood wrote:
> Josh Berkus wrote:
>> All,
>>
>> Wow, am I really the only person here who's used IOZone?
>>
>
> No - I used to use it exclusively, but everyone else tended to demand I
> redo stuff with bonnie before taking any finding seriously... so I've
> kinda 'submitted to the Borg' as it were....

Bonnie++ has its own issues with concurrency; it's using some kind of
ad-hoc threading implementation, which results in not getting real
parallelism. I just did a test with -c 8 on Bonnie++ 1.95, and the
program only ever used 3 cores.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>
Cc: "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 17:11:46
Message-ID: C604CBE2.4856%scott@richrelevance.com

I've switched to using FIO.

Bonnie in my experience produces poor results and is better suited to
testing desktop/workstation type load. Most of its tests don't apply to how
postgres writes/reads anyway.

IOZone is a bit more troublesome to get working on the file(s) you want
under concurrency, and it is also hard to get it to avoid the OS file cache.
On systems with lots of RAM, it takes too long as a result. I personally like
it better than Bonnie by far, but it's not flexible enough for me and is
often used by hardware providers to 'show' their RAID cards are doing fine
(PERC 6 doing 4GB/sec file access -- see! It's fine!) when the thing is just
testing in-memory cached reads for most of the test, or all of it if not
configured right...
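
For anyone who stays with IOZone, here is a minimal sketch of an invocation
that tries to keep the OS cache out of the numbers. It assumes the iozone3
options -I (request O_DIRECT), -e (include fsync in the timing), -i (test
selection), -r (record size), -s (file size), -t (throughput mode with N
workers) and -F (one file name per worker). The helper name, paths and sizes
are made up for illustration, and the function only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: print an iozone command line for N concurrent
# workers doing 8k I/O on files under a chosen directory.
# Nothing is executed here; run the printed command by hand.
iozone_cmd() {
    dir=$1      # directory on the filesystem under test
    workers=$2  # number of concurrent workers (-t)
    size=$3     # per-worker file size, e.g. 4g; keep the total well above RAM

    files=""
    i=1
    while [ "$i" -le "$workers" ]; do
        files="$files $dir/iozone.$i"
        i=$((i + 1))
    done

    # -i 0 = write/rewrite, -i 1 = read/reread, -i 2 = random read/write
    # -I   = ask for O_DIRECT so reads are not served from the OS cache
    # -e   = include fsync time in the write results
    echo "iozone -I -e -i 0 -i 1 -i 2 -r 8k -s $size -t $workers -F$files"
}

iozone_cmd /data/test 4 4g
```

Whether O_DIRECT is honoured depends on the filesystem, so it is worth
checking that the reported read rates are plausible for the hardware.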

FIO with profiles such as the below samples are easy to set up, and they can
be mix/matched to test what happens with mixed read/write seq/rand -- with
surprising and useful tuning results. Forcing a cache flush or sync before
or after a run is trivial. Changing to asynchronous I/O, direct I/O, or
other forms is trivial. The output result formatting is very useful as
well.

I got into using FIO when I needed to test a matrix of about 400 different
tuning combinations. This would have taken a month with Iozone, but I could
create my profiles with FIO, force the OS cache to flush, and constrain the
time appropriately for each test, and run the batch overnight.

#----------------
[read-rand]
rw=randread
; this will be total of all individual files per process
size=1g
directory=/data/test
fadvise_hint=0
blocksize=8k
direct=0
ioengine=sync
iodepth=1
numjobs=32
; this is number of files total per process
nrfiles=1
group_reporting=1
runtime=1m
exec_prerun=echo 3 > /proc/sys/vm/drop_caches
#--------------------
[read]
rw=read
; this will be total of all individual files per process
size=512m
directory=/data/test
fadvise_hint=0
blocksize=8k
direct=0
ioengine=sync
iodepth=1
numjobs=8
; this is number of files total per process
nrfiles=1
runtime=30s
group_reporting=1
exec_prerun=echo 3 > /proc/sys/vm/drop_caches

#----------------------
[write]
rw=write
; this will be total of all individual files per process
size=4g
directory=/data/test
fadvise_hint=0
blocksize=8k
direct=0
ioengine=sync
iodepth=1
numjobs=1
;rate=10000
; this is number of files total per process
nrfiles=1
runtime=48s
group_reporting=1
end_fsync=1
exec_prerun=sync; echo 3 > /proc/sys/vm/drop_caches
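
A batch of profiles like these can be generated rather than written by hand.
The following is only a sketch: the function names, the output directory and
the particular parameter values are illustrative, not the roughly 400
combinations mentioned above. It writes one job file per combination of
access pattern and process count, reusing options from the samples, and
keeps the cache flush in the runner instead of in exec_prerun.

```shell
#!/bin/sh
# Hypothetical generator: one fio job file per (rw mode, numjobs) pair.
gen_jobs() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    for rw in read randread write randwrite; do
        for jobs in 1 4 8 32; do
            cat > "$outdir/$rw-$jobs.fio" <<EOF
[$rw-$jobs]
rw=$rw
size=1g
directory=/data/test
blocksize=8k
direct=0
ioengine=sync
numjobs=$jobs
nrfiles=1
group_reporting=1
runtime=1m
EOF
        done
    done
}

# Run every generated job file in turn, flushing dirty pages and dropping
# the OS cache before each one (needs root). Each result is written next
# to its job file as <name>.out.
run_jobs() {
    for f in "$1"/*.fio; do
        sync
        echo 3 > /proc/sys/vm/drop_caches
        fio --output="${f%.fio}.out" "$f"
    done
}
```

Something like `gen_jobs /tmp/fio-matrix && run_jobs /tmp/fio-matrix` would
then run the whole set unattended; with runtime=1m the 16 files here take
roughly a quarter of an hour.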

On 4/9/09 10:41 PM, "Josh Berkus" <josh(at)agliodbs(dot)com> wrote:

> All,
>
> Wow, am I really the only person here who's used IOZone?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> www.pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: henk de wit <henk53602(at)hotmail(dot)com>, jesper(at)krogh(dot)cc, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 17:31:35
Message-ID: 49DF8277.4050801@agliodbs.com

Scott,

> FIO with profiles such as the below samples are easy to set up, and they can
> be mix/matched to test what happens with mixed read/write seq/rand -- with
> surprising and useful tuning results. Forcing a cache flush or sync before
> or after a run is trivial. Changing to asynchronous I/O, direct I/O, or
> other forms is trivial. The output result formatting is very useful as
> well.

FIO? Link?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 17:40:46
Message-ID: C604D2AE.486A%scott@richrelevance.com


On 4/10/09 10:31 AM, "Josh Berkus" <josh(at)agliodbs(dot)com> wrote:

> Scott,
>
>> FIO with profiles such as the below samples are easy to set up, and they can
>> be mix/matched to test what happens with mixed read/write seq/rand -- with
>> surprising and useful tuning results. Forcing a cache flush or sync before
>> or after a run is trivial. Changing to asynchronous I/O, direct I/O, or
>> other forms is trivial. The output result formatting is very useful as
>> well.
>
> FIO? Link?

First google result:
http://freshmeat.net/projects/fio/

Written by Jens Axboe, the Linux Kernel I/O block layer maintainer. He
wrote the CFQ scheduler and Noop scheduler, and is the author of blktrace as
well.

" fio is an I/O tool meant to be used both for benchmark and stress/hardware
verification. It has support for 13 different types of I/O engines (sync,
mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi,
solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O,
forked or threaded jobs, and much more. It can work on block devices as well
as files. fio accepts job descriptions in a simple-to-understand text
format. Several example job files are included. fio displays all sorts of
I/O performance information. It supports Linux, FreeBSD, and OpenSolaris"

>
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> www.pgexperts.com
>


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 18:01:33
Message-ID: alpine.GSO.2.01.0904101359490.21946@westnet.com

On Fri, 10 Apr 2009, Scott Carey wrote:

> FIO with profiles such as the below samples are easy to set up

There are some more sample FIO profiles with results from various
filesystems at
http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 18:17:39
Message-ID: C604DB53.4877%scott@richrelevance.com


On 4/10/09 11:01 AM, "Greg Smith" <gsmith(at)gregsmith(dot)com> wrote:

> On Fri, 10 Apr 2009, Scott Carey wrote:
>
>> FIO with profiles such as the below samples are easy to set up
>
> There are some more sample FIO profiles with results from various
> filesystems at
> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide

I wish to thank Greg here as many of my profile variations came from the
above as a starting point.

Note that in his results, the XFS file system behavior on random writes is
due to FIO doing 'sparse writes' in its default random write mode. Postgres
does not do sparse writes, and they expose some issues on XFS. To properly
simulate Postgres, these should be random overwrites.

Add 'overwrite=true' to the profile for random writes and the whole file
will be allocated before randomly (over)writing to it.

Here is the man page:
http://linux.die.net/man/1/fio
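
As a sketch of what that looks like, here is a random-write profile with
overwrite enabled, modelled on the read-rand sample earlier in the thread.
The helper name, job name, file size and process count are assumptions; only
the overwrite option is the point, written here as overwrite=1.

```shell
#!/bin/sh
# Hypothetical helper: write a random-overwrite fio job file to the given path.
write_randwrite_job() {
    cat > "$1" <<'EOF'
[write-rand]
rw=randwrite
; lay the file out in full first, then overwrite it in place
overwrite=1
size=1g
directory=/data/test
fadvise_hint=0
blocksize=8k
direct=0
ioengine=sync
numjobs=8
nrfiles=1
group_reporting=1
runtime=1m
end_fsync=1
EOF
}

write_randwrite_job "${TMPDIR:-/tmp}/write-rand.fio"
```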

>
> --
> * Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
>


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-10 19:25:01
Message-ID: alpine.GSO.2.01.0904101524110.10958@westnet.com

On Fri, 10 Apr 2009, Scott Carey wrote:

> I wish to thank Greg here as many of my profile variations came from the
> above as a starting point.

That page was mainly Mark Wong's work, I just remembered where it was.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "M(dot) Edward (Ed) Borasky" <zznmeb(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-11 00:03:37
Message-ID: af0420cd0904101703k3a150bd6o35ff2b8c7dc812de@mail.gmail.com

I've done quite a bit with IOzone, but if you're on Linux, you have lots of
options. In particular, you can actually capture I/O patterns from a running
application with blktrace, and then replay them with btrecord / btreplay.

The documentation for this stuff is a bit hard to find. Some of the distros
don't install it by default. But have a look at

http://ow.ly/2zyW

for some "Getting Started" info.
--
M. Edward (Ed) Borasky
http://www.linkedin.com/in/edborasky

I've never met a happy clam. In fact, most of them were pretty steamed.


From: Mark Wong <markwkm(at)gmail(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Gabrielle Roth <gorthx(at)gmail(dot)com>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-11 18:44:33
Message-ID: 70c01d1d0904111144m50c569bbh7dc32f2bce36195e@mail.gmail.com

On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
> On Fri, 10 Apr 2009, Scott Carey wrote:
>
>> FIO with profiles such as the below samples are easy to set up
>
> There are some more sample FIO profiles with results from various
> filesystems at
> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide

There's a couple of potential flaws I'm trying to characterize this
weekend. I'm having second thoughts about how I did the sequential
read and write profiles. Using multiple processes doesn't let it
really do sequential i/o. I've done one comparison so far resulting
in about 50% more throughput using just one process to do sequential
writes. I just want to make sure there shouldn't be any concern for
being processor bound on one core.

The other flaw is having a minimum run time. The max of 1 hour seems
to be good to establishing steady system utilization, but letting some
tests finish in less than 15 minutes doesn't provide "good" data.
"Good" meaning looking at the time series of data and feeling
confident it's a reliable result. I think I'm describing that
correctly...

Regards,
Mark


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Mark Wong <markwkm(at)gmail(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Gabrielle Roth <gorthx(at)gmail(dot)com>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-12 02:00:07
Message-ID: C6069937.4939%scott@richrelevance.com

On 4/11/09 11:44 AM, "Mark Wong" <markwkm(at)gmail(dot)com> wrote:

> On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
>> On Fri, 10 Apr 2009, Scott Carey wrote:
>>
>>> FIO with profiles such as the below samples are easy to set up
>>
>> There are some more sample FIO profiles with results from various
>> filesystems at
>> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide
>
> There's a couple of potential flaws I'm trying to characterize this
> weekend. I'm having second thoughts about how I did the sequential
> read and write profiles. Using multiple processes doesn't let it
> really do sequential i/o. I've done one comparison so far resulting
> in about 50% more throughput using just one process to do sequential
> writes. I just want to make sure there shouldn't be any concern for
> being processor bound on one core.

FWIW, my raid array will do 1200MB/sec, and no tool I've used can saturate
it without at least two processes. 'dd' and fio can get close (1050MB/sec),
if the block size is <= ~32k <=64k. With a postgres sized 8k block 'dd'
can't top 900MB/sec or so. FIO can saturate it only with two+ readers.

I optimized my configuration for 4 concurrent sequential readers with 4
concurrent random readers, and this helped the overall real world
performance a lot. I would argue that on any system with concurrent
queries, concurrency of all types is important to measure. Postgres isn't
going to hold up one sequential scan to wait for another. Postgres on a
3.16Ghz CPU is CPU bound on a sequential scan at between 250MB/sec and
800MB/sec on the type of tables/queries I have. Concurrent sequential
performance was affected by:
Xfs -- the gain over ext3 was large
Readahead tuning -- about 2MB per spindle was optimal (20MB for me, sw raid
0 on 2x[10 drive hw raid 10]).
Deadline scheduler (big difference with concurrent sequential + random
mixed).

One reason your tests write so much faster than they read was the linux
readahead value not being tuned as you later observed. This helps ext3 a
lot, and xfs enough so that fio single threaded was faster than 'dd' to the
raw device.
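
To make the readahead arithmetic concrete: blockdev --setra takes its value
in 512-byte sectors, so 2MB per spindle with a spindle count of 10 (which is
what the 20MB figure above implies) comes to 40960 sectors. The sketch below
only prints the commands. The device name sdb and the helper names are
placeholders, which device each setting belongs on depends on the storage
stack, and the right per-spindle figure for other hardware has to be
measured.

```shell
#!/bin/sh
# Convert "MB of readahead per spindle" into the 512-byte sector count
# that blockdev --setra expects.
ra_sectors() {
    spindles=$1
    mb_per_spindle=$2
    echo $(( spindles * mb_per_spindle * 1024 * 1024 / 512 ))
}

# Print (not run) the tuning commands for a block device such as sdb.
# Both need root when actually executed. On newer kernels the scheduler
# may be listed as mq-deadline instead of deadline.
print_tuning() {
    dev=$1
    echo "blockdev --setra $(ra_sectors "$2" "$3") /dev/$dev"
    echo "echo deadline > /sys/block/$dev/queue/scheduler"
}

print_tuning sdb 10 2
```

Running `blockdev --getra /dev/sdb` before and after shows whether the new
value took effect.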

>
> The other flaw is having a minimum run time. The max of 1 hour seems
> to be good to establishing steady system utilization, but letting some
> tests finish in less than 15 minutes doesn't provide "good" data.
> "Good" meaning looking at the time series of data and feeling
> confident it's a reliable result. I think I'm describing that
> correctly...

It really depends on the specific test though. You can usually get random
iops numbers that are realistic in a fairly short time, and 1 minute long
tests for me vary by about 3% (which can be +-35MB/sec in my case).

I ran my tests on a partition that was only 20% the size of the whole
volume, and at the front of it. Sequential transfer varies by a factor of 2
across a SATA disk from start to end, so if you want to compare file systems
fairly on sequential transfer rate you have to limit the partition to an
area with relatively constant STR or else one file system might win just
because it placed your file earlier on the drive.

>
> Regards,
> Mark
>


From: Mark Wong <markwkm(at)gmail(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Gabrielle Roth <gorthx(at)gmail(dot)com>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-27 03:28:13
Message-ID: 70c01d1d0904262028j42fc4940s116fc14ec26d2e35@mail.gmail.com

On Sat, Apr 11, 2009 at 11:44 AM, Mark Wong <markwkm(at)gmail(dot)com> wrote:
> On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
>> On Fri, 10 Apr 2009, Scott Carey wrote:
>>
>>> FIO with profiles such as the below samples are easy to set up
>>
>> There are some more sample FIO profiles with results from various
>> filesystems at
>> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide
>
> There's a couple of potential flaws I'm trying to characterize this
> weekend.  I'm having second thoughts about how I did the sequential
> read and write profiles.  Using multiple processes doesn't let it
> really do sequential i/o.  I've done one comparison so far resulting
> in about 50% more throughput using just one process to do sequential
> writes.  I just want to make sure there shouldn't be any concern for
> being processor bound on one core.
>
> The other flaw is having a minimum run time.  The max of 1 hour seems
> to be good to establishing steady system utilization, but letting some
> tests finish in less than 15 minutes doesn't provide "good" data.
> "Good" meaning looking at the time series of data and feeling
> confident it's a reliable result.  I think I'm describing that
> correctly...

FYI, I've updated the wiki with the parameters I'm running with now.
I haven't updated the results yet though.

Regards,
Mark


From: Mark Wong <markwkm(at)gmail(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, henk de wit <henk53602(at)hotmail(dot)com>, "jesper(at)krogh(dot)cc" <jesper(at)krogh(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Gabrielle Roth <gorthx(at)gmail(dot)com>
Subject: Re: Using IOZone to simulate DB access patterns
Date: 2009-04-27 03:44:51
Message-ID: 70c01d1d0904262044x20b1d282sfbb16b250d26e722@mail.gmail.com

On Sat, Apr 11, 2009 at 7:00 PM, Scott Carey <scott(at)richrelevance(dot)com> wrote:
>
>
> On 4/11/09 11:44 AM, "Mark Wong" <markwkm(at)gmail(dot)com> wrote:
>
>> On Fri, Apr 10, 2009 at 11:01 AM, Greg Smith <gsmith(at)gregsmith(dot)com> wrote:
>>> On Fri, 10 Apr 2009, Scott Carey wrote:
>>>
>>>> FIO with profiles such as the below samples are easy to set up
>>>
>>> There are some more sample FIO profiles with results from various
>>> filesystems at
>>> http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide
>>
>> There's a couple of potential flaws I'm trying to characterize this
>> weekend.  I'm having second thoughts about how I did the sequential
>> read and write profiles.  Using multiple processes doesn't let it
>> really do sequential i/o.  I've done one comparison so far resulting
>> in about 50% more throughput using just one process to do sequential
>> writes.  I just want to make sure there shouldn't be any concern for
>> being processor bound on one core.
>
> FWIW, my raid array will do 1200MB/sec, and no tool I've used can saturate
> it without at least two processes.  'dd' and fio can get close (1050MB/sec),
> if the block size is <= ~32k <=64k.  With a postgres sized 8k block 'dd'
> can't top 900MB/sec or so. FIO can saturate it only with two+ readers.
>
> I optimized my configuration for 4 concurrent sequential readers with 4
> concurrent random readers, and this helped the overall real world
> performance a lot.  I would argue that on any system with concurrent
> queries, concurrency of all types is important to measure.  Postgres isn't
> going to hold up one sequential scan to wait for another.  Postgres on a
> 3.16Ghz CPU is CPU bound on a sequential scan at between 250MB/sec and
> 800MB/sec on the type of tables/queries I have.  Concurrent sequential
> performance was affected by:
> Xfs -- the gain over ext3 was large
> Readahead tuning -- about 2MB per spindle was optimal (20MB for me, sw raid
> 0 on 2x[10 drive hw raid 10]).
> Deadline scheduler (big difference with concurrent sequential + random
> mixed).
>
> One reason your tests write so much faster than they read was the linux
> readahead value not being tuned as you later observed.  This helps ext3 a
> lot, and xfs enough so that fio single threaded was faster than 'dd' to the
> raw device.
>
>>
>> The other flaw is having a minimum run time.  The max of 1 hour seems
>> to be good to establishing steady system utilization, but letting some
>> tests finish in less than 15 minutes doesn't provide "good" data.
>> "Good" meaning looking at the time series of data and feeling
>> confident it's a reliable result.  I think I'm describing that
>> correctly...
>
> It really depends on the specific test though.  You can usually get random
> iops numbers that are realistic in a fairly short time, and 1 minute long
> tests for me vary by about 3% (which can be +-35MB/sec in my case).
>
> I ran my tests on a partition that was only 20% the size of the whole
> volume, and at the front of it.  Sequential transfer varies by a factor of 2
> across a SATA disk from start to end, so if you want to compare file systems
> fairly on sequential transfer rate you have to limit the partition to an
> area with relatively constant STR or else one file system might win just
> because it placed your file earlier on the drive.

That's probably what is going on with the 1 disk test:

http://207.173.203.223/~markwkm/community10/fio/linux-2.6.28-gentoo/1-disk-raid0/ext2/seq-read/io-charts/iostat-rMB.s.png

versus the 4 disk test:

http://207.173.203.223/~markwkm/community10/fio/linux-2.6.28-gentoo/4-disk-raid0/ext2/seq-read/io-charts/iostat-rMB.s.png

These are the throughput numbers, but the IOPS charts are in the same directory.

Regards,
Mark