RAID stripe size question

From: "Mikael Carneholm" <Mikael(dot)Carneholm(at)WirelessCar(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: RAID stripe size question
Date: 2006-07-16 22:52:17
Message-ID: 7F10D26ECFA1FB458B89C5B4B0D72C2B4E4BB1@sesrv12.wirelesscar.com
Lists: pgsql-performance

I have finally gotten my hands on the MSA1500 that we ordered some time
ago. It has 28 x 10K 146GB drives, currently grouped as 10 (for WAL) +
18 (for data). There's only one controller (an Emulex), but I hope
performance won't suffer too much from that. RAID level is 0+1, and the
filesystem is ext3.

Now to the interesting part: would it make sense to use different stripe
sizes on the separate disk arrays? In theory, a smaller stripe size
(8-32K) should increase sequential write throughput at the cost of
decreased positioning performance, which sounds good for WAL (assuming
WAL is never "searched" during normal operation). For the disks holding
the data, a larger stripe size (>32K) should allow more concurrent
(small) reads/writes at the cost of decreased raw throughput. This is
with an OLTP-type application in mind, so I'd rather have high
transaction throughput than high sequential read speed. The interface is
2Gb FC, so I'm throttled to (theoretically) about 192MB/s anyway.
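
As a rough sanity check on that ceiling (a back-of-the-envelope sketch,
assuming the nominal 2Gb FC line rate and 8b/10b framing - the exact
usable figure varies, but it lands in the same ~200MB/s ballpark):

    # Back-of-the-envelope check (assumed figures, nothing measured):
    # 2Gb FC signals at 2.125 Gbit/s and uses 8b/10b encoding, so only
    # 8 of every 10 bits on the wire carry payload.
    line_rate_bit_s = 2.125e9        # nominal 2Gb FC line rate
    payload_fraction = 8 / 10        # 8b/10b encoding overhead
    mb_per_s = line_rate_bit_s * payload_fraction / 8 / 1e6
    print(f"usable FC bandwidth ~ {mb_per_s:.0f} MB/s")   # ~212 MB/s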

So, does this make sense? Has anyone tried it and seen any performance
gains from it?

Regards,
Mikael.


From: "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: RAID stripe size question
Date: 2006-07-16 23:10:05
Message-ID: 20060716231005.GA4746@uio.no
Lists: pgsql-performance

On Mon, Jul 17, 2006 at 12:52:17AM +0200, Mikael Carneholm wrote:
> Now to the interesting part: would it make sense to use different stripe
> sizes on the separate disk arrays? In theory, a smaller stripe size
> (8-32K) should increase sequential write throughput at the cost of
> decreased positioning performance, which sounds good for WAL (assuming
> WAL is never "searched" during normal operation).

For large writes (i.e. sequential write throughput), it doesn't really matter
what the stripe size is; all the disks will have to both seek and write
anyhow.
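
A minimal sketch of that point, with made-up numbers (a single 4MB
sequential write across a hypothetical 9-member stripe set):

    # Illustrative only (assumed numbers): count how many striped members
    # a single large sequential write touches. Once the write spans a
    # full stripe row, every disk participates regardless of stripe size.
    def disks_touched(write_kb, n_disks, stripe_kb):
        chunks = -(-write_kb // stripe_kb)     # ceiling division
        return min(n_disks, chunks)

    for stripe_kb in (8, 32, 128):
        print(f"{stripe_kb:>3}KB stripe -> {disks_touched(4096, 9, stripe_kb)} disks")
    # 8KB, 32KB and 128KB stripes all hit all 9 disks for a 4MB write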

/* Steinar */
--
Homepage: http://www.sesse.net/


From: Michael Stone <mstone+postgres(at)mathom(dot)us>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: RAID stripe size question
Date: 2006-07-17 00:03:40
Message-ID: 20060717000337.GA8069@mathom.us
Lists: pgsql-performance

On Mon, Jul 17, 2006 at 12:52:17AM +0200, Mikael Carneholm wrote:
>I have finally gotten my hands on the MSA1500 that we ordered some time
>ago. It has 28 x 10K 146Gb drives, currently grouped as 10 (for wal) +
>18 (for data). There's only one controller (an emulex), but I hope

You've got 1.4TB of disk assigned to the WAL, which doesn't normally need more
than a couple of gigs?
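
For scale, a rough sketch using the usual rule of thumb for PostgreSQL of
that era (assuming on-disk WAL is bounded by roughly 2 * checkpoint_segments
+ 1 segments of 16MB each; the checkpoint_segments values below are purely
illustrative):

    # Rough WAL sizing sketch (assumed rule of thumb, not a measurement):
    # WAL on disk stays around (2 * checkpoint_segments + 1) files of 16MB.
    def max_wal_mb(checkpoint_segments):
        return (2 * checkpoint_segments + 1) * 16

    for segs in (3, 64, 256):
        print(f"checkpoint_segments={segs:>3} -> ~{max_wal_mb(segs)} MB")
    # 3 (default) -> 112 MB, 64 -> ~2 GB, 256 -> ~8 GB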

Mike Stone


From: "Alex Turner" <armtuk(at)gmail(dot)com>
To: "Mikael Carneholm" <Mikael(dot)Carneholm(at)wirelesscar(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: RAID stripe size question
Date: 2006-07-17 06:13:04
Message-ID: 33c6269f0607162313k1770ac16ld5e90e106bf5e8dc@mail.gmail.com
Lists: pgsql-performance

With 18 disks dedicated to data, you could make 100/7*9 seeks/second (7ms
avg seek time, 9 independent units), which is ~128 seeks/second writing on
average 64kB of data - that's 4.1MB/s throughput worst case, probably 10x
best case, so ~40MB/s. You might want to take more disks for your data and
fewer for your WAL.

Someone check my math here...
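
One hedged way to redo that arithmetic (assumptions: 18 data disks in
RAID 0+1 give 9 mirrored pairs, 7ms average seek, 64kB per write,
rotational latency and transfer time ignored) comes out roughly ten
times higher:

    # Re-checking the seek arithmetic under stated assumptions: mirroring
    # means each write hits both disks of a pair, so a pair delivers
    # roughly one disk's worth of random-write IOPS.
    pairs = 18 // 2                       # independent units after mirroring
    iops_per_pair = 1000.0 / 7.0          # ~143 random writes/s per pair
    total_iops = pairs * iops_per_pair    # ~1286 writes/s across the array
    mb_per_s = total_iops * 64 / 1024     # at 64kB per write
    print(f"~{total_iops:.0f} writes/s, ~{mb_per_s:.0f} MB/s worst case")
    # ~1286 writes/s, ~80 MB/s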

And as always - run benchmarks with your app to verify.

Alex.

On 7/16/06, Mikael Carneholm <Mikael(dot)Carneholm(at)wirelesscar(dot)com> wrote:
>
> I have finally gotten my hands on the MSA1500 that we ordered some time
> ago. It has 28 x 10K 146Gb drives, currently grouped as 10 (for wal) + 18
> (for data). There's only one controller (an emulex), but I hope performance
> won't suffer too much from that. Raid level is 0+1, filesystem is ext3.
>
> Now to the interesting part: would it make sense to use different stripe
> sizes on the separate disk arrays? In theory, a smaller stripe size (8-32K)
> should increase sequential write throughput at the cost of decreased
> positioning performance, which sounds good for WAL (assuming WAL is never
> "searched" during normal operation). And for disks holding the data, a
> larger stripe size (>32K) should provide for more concurrent (small)
> reads/writes at the cost of decreased raw throughput. This is with an OLTP
> type application in mind, so I'd rather have high transaction throughput
> than high sequential read speed. The interface is a 2Gb FC so I'm throttled
> to (theoretically) 192Mb/s, anyway.
>
> So, does this make sense? Has anyone tried it and seen any performance
> gains from it?
>
> Regards,
> Mikael.
>