Re: H/W RAID 5 on slower disks versus no raid on faster HDDs

From: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
To: pgsql-performance(at)postgresql(dot)org, pgsql-admin(at)postgresql(dot)org
Subject: H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 16:45:02
Message-ID: 200211212215.02699.mallah@trade-india.com
Lists: pgsql-admin pgsql-performance


Hi folks,

I have two options:
3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
and
2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID

Does anyone have opinions, *performance wise*, on the pros and cons of the
above two options?

Please take into consideration the higher RPM and better SCSI interface
in the latter case.

Regds
Mallah.

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.


From: "Charles H(dot) Woloszynski" <chw(at)clearmetrix(dot)com>
To: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster
Date: 2002-11-21 17:06:03
Message-ID: 3DDD127B.3030003@clearmetrix.com

How are you going to make use of the two faster drives under
postgresql? Were you intending to put the WAL, system/swap, and the
actual data files on separate drives/partitions? Unless you do
something like that (or s/w RAID to distribute the processing across the
disks), you really have ONE SCSI 15K Ultra320 drive against 3 slower
drives with the RAID overhead (and the spreading of load across
the multiple heads).

I don't have specifics here, but I'd expect that the RAID5 on slower
drives would work better for apps with lots of selects or lots of
concurrent users. I suspect that the Ultra320 would be better for batch
jobs and mostly transactions with fewer selects.

Charlie

Rajesh Kumar Mallah. wrote:

>Hi folks,
>
>I have two options:
>3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
>and
>2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
>
>Does anyone opinions *performance wise* the pros and cons of above
>two options.
>
>please take in consideration in latter case its higher RPM and better
>SCSI interface.
>
>
>
>Regds
>Mallah.
>
>
>
>
>
>

--

Charles H. Woloszynski

ClearMetrix, Inc.
115 Research Drive
Bethlehem, PA 18015

tel: 610-419-2210 x400
fax: 240-371-3256
web: www.clearmetrix.com


From: Chris Ruprecht <chris(at)ruprecht(dot)org>
To: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 17:19:35
Message-ID: 200211211219.35550.chris@ruprecht.org

RAID 0 (striping) spreads the load over multiple spindles, the same way RAID 5
does, but RAID 5 always needs to calculate parity and write that to disk
as well.

RPM isn't that critical; a lot depends on the machine, the processor and the
memory (and the speed with which the processor can get to the memory). I have
recently tested a lot of systems with some database benchmarks we wrote here
at work. We're not running Postgres here at work, sorry, so these benchmarks are
of no use to Postgres ...
What we found is that a lot depends on motherboard design, not so much on drive
speed. We got to stages where we allocated 1.8 GB of RAM to shared memory for
the database server process, resulting in the entire database being sucked
into memory. When doing reads, 100% of the data is coming out of that
memory, and drive speed becomes irrelevant.

From tests I did with Postgres on my boxes at home, I can say: the more shared
memory you can throw at the server process, the better. Under MacOS X I
wasn't able to allocate more than 3 MB; under Linux, I can allocate anything
I want to, so I usually start up the server with 256 MB. The difference? A
process which takes 4 minutes under Linux, takes 6 hours under MacOS - same
hardware, same drives, different memory settings.
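
For anyone translating "256 MB" into a server setting, here is a rough sketch of the arithmetic. It assumes the default 8 kB block size and a PostgreSQL release where shared_buffers is given as a count of buffers (as in the 7.x series); the function name is made up for illustration.

```python
BLOCK_SIZE_KB = 8  # assumption: default PostgreSQL block size of 8 kB


def shared_buffers_for(megabytes: int) -> int:
    """Return how many 8 kB buffers add up to the given number of megabytes."""
    return megabytes * 1024 // BLOCK_SIZE_KB


# Buffer counts for the two memory sizes mentioned above.
for mb in (3, 256):
    print(mb, "MB ->", shared_buffers_for(mb), "buffers")
```

The kernel's shared memory limits still have to allow a segment of that size, which is probably what differed between the two operating systems.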

Best regards,
Chris

On Thursday 21 November 2002 12:02, you wrote:
> Thanks Chris,
>
> does raid0 enhances both read/write both?
>
> does rpms not matter that much?
>
> regds
> mallah.
>
> On Thursday 21 November 2002 22:27, you wrote:
> > RAID 5 gives you pretty bad performance, a slowdown of about 50%. For
> > pure performance, I'd use the 3 18 GB drives with RAID 0.
> >
> > If you need fault tolerance, you could use RAID 0+1 or 1+0 but you'd need
> > an even number of drives for that, of which half would become 'usable
> > space'.
> >
> > Best regards,
> > Chris
> >
> > On Thursday 21 November 2002 11:45, you wrote:
> > > Hi folks,
> > >
> > > I have two options:
> > > 3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
> > > and
> > > 2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
> > >
> > > Does anyone opinions *performance wise* the pros and cons of above
> > > two options.
> > >
> > > please take in consideration in latter case its higher RPM and better
> > > SCSI interface.
> > >
> > >
> > >
> > > Regds
> > > Mallah.

--
Network Grunt and Bit Pusher extraordinaire


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster
Date: 2002-11-21 17:32:05
Message-ID: Pine.LNX.4.33.0211211029390.23081-100000@css120.ihs.com

On Thu, 21 Nov 2002, Rajesh Kumar Mallah. wrote:

>
> Hi folks,
>
> I have two options:
> 3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
> and
> 2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
>
> Does anyone opinions *performance wise* the pros and cons of above
> two options.
>
> please take in consideration in latter case its higher RPM and better
> SCSI interface.

Does the OS you're running on support software RAID? If so, the dual 36
gigs in a software RAID0 would be fastest, and in a RAID1 would still be
pretty fast, plus they would be redundant.

Depending on your queries, there may not be a lot of difference between
running the 3*18 hw RAID or the 2*36 setup, especially if most of your
data can fit into memory on the server.

Generally, the 2*36 should be faster for writing, and the 3*18 should be
about even for reads, maybe a little faster.


From: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 17:46:55
Message-ID: 200211212316.55081.mallah@trade-india.com

Oh, I did not mention:
it's Linux, and it does.

RAM: 2.0 GB
CPU: Dual 2.0 GHz Intel Xeon DP Processors.

On Thursday 21 November 2002 23:02, scott.marlowe wrote:
> On Thu, 21 Nov 2002, Rajesh Kumar Mallah. wrote:
> > Hi folks,
> >
> > I have two options:
> > 3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
> > and
> > 2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
> >
> > Does anyone opinions *performance wise* the pros and cons of above
> > two options.
> >
> > please take in consideration in latter case its higher RPM and better
> > SCSI interface.
>
> Does the OS you're running on support software RAID? If so the dual 36
> gigs in a RAID0 software would be fastest, and in a RAID1 would still be
> pretty fast plus they would be redundant.
>
> Depending on your queries, there may not be a lot of difference between
> running the 3*18 hw RAID or the 2*36 setup, especially if most of your
> data can fit into memory on the server.
>
> Generally, the 2*36 should be faster for writing, and the 3*18 should be
> about even for reads, maybe a little faster.

Since I have lots of RAM and my data size (on disk) is 2 GB, I feel frequent reads
can be served from memory.

I have heard that putting pg_xlog on a drive of its own helps boost updates to
the DB server.
In that case, should I forget about the h/w RAID and use one disk exclusively for the WAL?

Regds
mallah.

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.


From: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 17:58:43
Message-ID: 200211212328.43204.mallah@trade-india.com

OK, now I am reading Momjian's "PostgreSQL Hardware Performance Tuning"
once again ;-)

mallah.

On Thursday 21 November 2002 23:02, scott.marlowe wrote:
> On Thu, 21 Nov 2002, Rajesh Kumar Mallah. wrote:
> > Hi folks,
> >
> > I have two options:
> > 3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
> > and
> > 2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
> >
> > Does anyone opinions *performance wise* the pros and cons of above
> > two options.
> >
> > please take in consideration in latter case its higher RPM and better
> > SCSI interface.
>
> Does the OS you're running on support software RAID? If so the dual 36
> gigs in a RAID0 software would be fastest, and in a RAID1 would still be
> pretty fast plus they would be redundant.
>
> Depending on your queries, there may not be a lot of difference between
> running the 3*18 hw RAID or the 2*36 setup, especially if most of your
> data can fit into memory on the server.
>
> Generally, the 2*36 should be faster for writing, and the 3*18 should be
> about even for reads, maybe a little faster.

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.


From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 18:56:29
Message-ID: 20021121185629.C6A11103C2@polaris.pinpointresearch.com

I had long labored under the impression that RAID 5 should give me better
performance but I have since encountered many reports that this is not the
case. Do some searching on Google and you will probably find numerous
articles.

Note 3x18 w/RAID5 will give 36GB usable while 2x36 w/o RAID is 72GB.
You could use mirroring on the 2x36 and have the same usable space.
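
The capacity arithmetic can be sketched as follows. This is a minimal illustration using the usual textbook formulas for each RAID level; the function and level names are invented for the example and do not come from any particular tool.

```python
def usable_gb(level: str, drives: int, size_gb: int) -> int:
    """Usable capacity in GB for a set of equally sized drives."""
    if level in ("none", "raid0"):
        # Independent disks or plain striping: all space is usable.
        return drives * size_gb
    if level == "raid1":
        # Every drive holds a full copy, so only one drive's worth is usable.
        return size_gb
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space goes to parity.
        return (drives - 1) * size_gb
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        # Half of the drives are mirrors of the other half.
        return drives // 2 * size_gb
    raise ValueError(f"unknown level: {level}")


print(usable_gb("raid5", 3, 18))  # the 3x18 RAID 5 option
print(usable_gb("none", 2, 36))   # the 2x36 option without RAID
print(usable_gb("raid1", 2, 36))  # the 2x36 option mirrored
```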

A mirrored 2x36 setup will probably yield a marginal hit on writes (vs a
single disk) and an improvement on reads due to having two drives to read
from and will (based on the Scientific Wild Ass Guess method and knowing
nothing about your overall system) probably be faster than the RAID5
configuration while giving you identical usable space and data safety.

You also may see improvements due to the 15,000RPM drives (of course RPM is
sort of an arbitrary measure - you really want to know about track access
times, latency, transfer rate, etc. and RPM is just one influencing factor
for the above).
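
To put a number on the part that RPM does control, here is a small sketch of average rotational latency, assuming the requested sector is on average half a rotation away. Seek time and transfer rate are deliberately left out, since they depend on the drive model.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait for a sector to come under the head, in milliseconds."""
    full_rotation_ms = 60_000 / rpm  # one full rotation
    return full_rotation_ms / 2      # on average, half a rotation away


for rpm in (10_000, 15_000):
    print(rpm, "RPM ->", avg_rotational_latency_ms(rpm), "ms")
```

Under that assumption the 15,000 RPM drive saves about a millisecond per random access on rotation alone.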

The quality of your RAID cards will also be important (how fast do they
perform their calculations, how much buffer do they have) as will the overall
specs of your system. If you have a bottleneck somewhere other than your raw
disk I/O then you can throw all the money you want at faster drives and see
no improvement.

Cheers,
Steve

On Thursday 21 November 2002 8:45 am, you wrote:
> Hi folks,
>
> I have two options:
> 3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
> and
> 2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
>
> Does anyone opinions *performance wise* the pros and cons of above
> two options.
>
> please take in consideration in latter case its higher RPM and better
> SCSI interface.
>
>
>
> Regds
> Mallah.


From: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
To: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 19:08:43
Message-ID: 200211220038.43950.mallah@trade-india.com

Thanks Steve,

Recently I have come to know that I can only get 3*18 GB Ultra160 10K
hard drives.

My OS is Linux; other parameters are
RAM: 2 GB, CPU: 2*2 GHz Xeon.

I feel I will do away with RAID and use one disk for the OS
and pg_dumps,
one for tables and the last one for WAL. Does this sound good?

regds
mallah.

On Friday 22 November 2002 00:26, Steve Crawford wrote:
> I had long labored under the impression that RAID 5 should give me better
> performance but I have since encountered many reports that this is not the
> case. Do some searching on Google and you will probably find numerous
> articles.
>
> Note 3x18 w/RAID5 will give 36GB usable while 2x36 w/o RAID is 72GB.
> You could use mirroring on the 2x36 and have the same usable space.
>
> A mirrored 2x36 setup will probably yield a marginal hit on writes (vs a
> single disk) and an improvement on reads due to having two drives to read
> from and will (based on the Scientific Wild Ass Guess method and knowing
> nothing about your overall system) probably be faster than the RAID5
> configuration while giving you identical usable space and data safety.
>
> You also may see improvements due to the 15,000RPM drives (of course RPM is
> sort of an arbitrary measure - you really want to know about track access
> times, latency, transfer rate, etc. and RPM is just one influencing factor
> for the above).
>
> The quality of your RAID cards will also be important (how fast do they
> perform their calculations, how much buffer do they have) as will the
> overall specs of you system. If you have a bottleneck somewhere other than
> your raw disk I/O then you can throw all the money you want at faster
> drives and see no improvement.
>
> Cheers,
> Steve
>
> On Thursday 21 November 2002 8:45 am, you wrote:
> > Hi folks,
> >
> > I have two options:
> > 3*18 GB 10,000 RPM Ultra160 Dual Channel SCSI controller + H/W Raid 5
> > and
> > 2*36 GB 15,000 RPM Ultra320 Dual Channel SCSI and no RAID
> >
> > Does anyone opinions *performance wise* the pros and cons of above
> > two options.
> >
> > please take in consideration in latter case its higher RPM and better
> > SCSI interface.
> >
> >
> >
> > Regds
> > Mallah.
>

--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.


From: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 19:24:19
Message-ID: 008e01c29193$9795eb80$0564a8c0@toolteam.net

> A mirrored 2x36 setup will probably yield a marginal hit on writes (vs a
> single disk) and an improvement on reads due to having two drives to read
> from and will (based on the Scientific Wild Ass Guess method and knowing

slightly offtopic:

Does anyone know if Linux software RAID 1 supports this method (reading from
both disks, thus doubling performance)?

Regards,
Bjoern


From: eric soroos <eric-psql(at)soroos(dot)net>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-21 19:30:44
Message-ID: 109501694.1174244252@[4.42.179.151]


> Does anyone one if linux software raid 1 supports this method (reading from
> both disks, thus doubling performance)?
>

From memory of reading slightly old (1999) howtos, I believe that the answer is yes, at least for the md system. Not sure about LVM, or even if mirroring is supported under LVM.

I would guess that it shouldn't be too hard to test:

1) set up a dataset on the mirrored system.
2) run pgbench or one of the TPC benchmarks.
3) fail one of the drives in the mirror.
4) run the test again.

If reads were being spread across both drives, read latency should go up
with one drive failed, and that should be reflected in the benchmark.
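
Comparing the two runs could look something like the sketch below. It assumes you have collected transactions-per-second figures from several runs before and after failing the drive; the function name and the sample numbers are purely illustrative, not measurements.

```python
from statistics import mean


def relative_change(before_tps, after_tps):
    """Relative change in mean throughput between two sets of benchmark runs."""
    before, after = mean(before_tps), mean(after_tps)
    return (after - before) / before


# Made-up sample figures, only to show the call.
healthy = [100, 100]
degraded = [50, 50]
print(f"{relative_change(healthy, degraded):+.0%}")
```

A clearly negative value on a read-heavy test would suggest the mirror was serving reads from both drives.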

eric


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
Cc: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 19:39:09
Message-ID: Pine.LNX.4.33.0211211235280.23530-100000@css120.ihs.com

On Fri, 22 Nov 2002, Rajesh Kumar Mallah. wrote:

>
>
> Thanks Steve,
>
> recently i have come to know that i can only get 3*18 GB ultra160 10K
> hraddrives,
>
> my OS is lunux , other parameters are
> RAM:2GB , CPU:2*2Ghz Xeon,
>
> i feel i will do away with raid use one disk for the OS
> and pg_dumps
>
> , one for tables and last one for WAL , does this sound good?

That depends. Are you going to be mostly reading, mostly updating, or an
even mix of both?

If you are going to be 95% reading, then don't bother moving WAL to
another drive, install the OS on the first 2 or 3 gigs of each drive, then
make a RAID5 out of what's left over and put everything on that.

If you're going to be mostly updating, then yes, your setup is a pretty
good choice.

If it will be mostly mixed, look at using a software RAID1.

More important will be tuning your database once it's up, i.e. increasing
shared buffers, setting random page costs to reflect what percentage of
your dataset is likely to be cached (the closer you come to caching your
whole dataset, the closer random page cost approaches 1)
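
As a rough illustration of that rule of thumb, the sketch below interpolates linearly between PostgreSQL's default random_page_cost of 4 (nothing cached) and 1 (everything cached). The linear shape is an assumption made for the example, not something the planner prescribes, and the function name is hypothetical.

```python
def estimate_random_page_cost(cached_fraction: float,
                              uncached_cost: float = 4.0) -> float:
    """Guess a random_page_cost from the fraction of the dataset held in cache."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    # Fully cached -> 1.0; nothing cached -> uncached_cost.
    return 1.0 + (uncached_cost - 1.0) * (1.0 - cached_fraction)


for fraction in (0.0, 0.5, 1.0):
    print(fraction, "->", estimate_random_page_cost(fraction))
```

Treat the result as a starting point to test against real queries, not as a final setting.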


From: Mike Nielsen <miken(at)bigpond(dot)net(dot)au>
To: Bjoern Metzdorf <bm(at)turtle-entertainment(dot)de>
Cc: Postgresql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 20:03:46
Message-ID: 1037909025.21282.510.camel@CPE-144-132-182-167.nsw.bigpond.net.au

Bjoern,

You may find that hoping for a doubling of performance by using RAID 1
is a little on the optimistic side.

Except on very long sequential reads, media transfer rates are unlikely
to be the limiting factor in disk throughput. Seek and rotational
latencies are the cost factor in random I/O, and with RAID 1, the
performance gain comes from reducing the mean latency -- on a single
request, one disk will be closer to the data than the other. If the
software that's handling the RAID 1 will schedule concurrent requests,
you lose the advantage of reducing mean latency in this fashion, but you
can get some improvement in throughput by overlapping some latency
periods.
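
The mean-latency effect can be sketched with a simplified model. It assumes the platters of the mirrored drives are at independent, uniformly distributed positions and that a single request is served by whichever drive reaches the sector first; the expected minimum of n such waits is the rotation time divided by (n + 1). Seek time is ignored, and the function name is made up for the example.

```python
def expected_rotational_latency_ms(rpm: int, mirrors: int = 1) -> float:
    """Expected rotational wait when the closest of `mirrors` drives serves the read."""
    if mirrors < 1:
        raise ValueError("need at least one drive")
    full_rotation_ms = 60_000 / rpm
    # Expected minimum of n independent uniform(0, T) waits is T / (n + 1).
    return full_rotation_ms / (mirrors + 1)


print(expected_rotational_latency_ms(10_000, mirrors=1))  # single drive
print(expected_rotational_latency_ms(10_000, mirrors=2))  # two-way mirror
```

Under this model a two-way mirror cuts the rotational part of the wait by a third, which is well short of doubling performance.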

While not wanting to argue against intelligent I/O design, memory is
cheap these days, and usually gives big bang-for-buck in improving
response times.

As to the specifics of how one level or another of Linux implements RAID
1, I'm afraid I can't shed much light at the moment.

Regards,

Mike

On Fri, 2002-11-22 at 06:24, Bjoern Metzdorf wrote:

> > A mirrored 2x36 setup will probably yield a marginal hit on writes (vs a
> > single disk) and an improvement on reads due to having two drives to read
> > from and will (based on the Scientific Wild Ass Guess method and knowing
>
> slightly offtopic:
>
> Does anyone one if linux software raid 1 supports this method (reading from
> both disks, thus doubling performance)?
>
> Regards,
> Bjoern
>
>

Michael Nielsen

ph: 0411-097-023 email: miken(at)bigpond(dot)net(dot)au



From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Bjoern Metzdorf <bm(at)turtle-entertainment(dot)de>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 20:17:05
Message-ID: Pine.LNX.4.33.0211211312240.23651-100000@css120.ihs.com

On Thu, 21 Nov 2002, Bjoern Metzdorf wrote:

> > A mirrored 2x36 setup will probably yield a marginal hit on writes (vs a
> > single disk) and an improvement on reads due to having two drives to read
> > from and will (based on the Scientific Wild Ass Guess method and knowing
>
> slightly offtopic:
>
> Does anyone one if linux software raid 1 supports this method (reading from
> both disks, thus doubling performance)?

Yes, it does. Generally speaking, it increases raw throughput by a factor
of 2 if you're grabbing enough data to justify reading it from both
drives. But for most database apps, you don't read enough at a time to
get a gain from this. I.e. if your stripe size is 8k and you're reading
1k at a time, no gain.
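
The point about request size can be sketched like this, assuming a simple layout where consecutive chunks of the given size are handed to the drives round-robin. That layout is an assumption for illustration; the function is hypothetical and does not describe how any particular RAID driver schedules reads.

```python
def drives_touched(offset: int, length: int, chunk_size: int, n_drives: int) -> int:
    """How many drives a single read spans under round-robin chunking."""
    if length <= 0:
        return 0
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    chunks = last_chunk - first_chunk + 1
    # A request cannot be spread over more drives than exist.
    return min(chunks, n_drives)


print(drives_touched(0, 1024, 8192, 2))       # 1k read inside one 8k chunk
print(drives_touched(0, 16 * 1024, 8192, 2))  # 16k read spanning two chunks
```

A read that stays inside one chunk keeps a single drive busy, so only larger reads, or many concurrent ones, spread the work.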

However, under parallel load, the extra drives really help.

In fact, the linux kernel supports >2 drives in a mirror. Useful for a
mostly read database that needs to handle lots of concurrent users.


From: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 20:53:02
Message-ID: 012201c2919f$fc4e1b40$0564a8c0@toolteam.net

> In fact, the linux kernel supports >2 drives in a mirror. Useful for a
> mostly read database that needs to handle lots of concurrent users.

Good to know.

What do you think is faster: 3 drives in raid 1 or 3 drives in raid 5?

Regards,
Bjoern


From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no
Date: 2002-11-21 21:20:56
Message-ID: web-1836145@davinci.ethosmedia.com

Bjoern,

> Good to know.
>
> What do you think is faster: 3 drives in raid 1 or 3 drives in raid
> 5?

My experience? Raid 1. But that depends on other factors as well;
your controller (software controllers use system RAM and thus lower
performance), what kind of reads you're getting and how often. IMHO,
RAID 5 is faster for sequential reads (large numbers of records on
clustered tables), RAID 1 for random reads.

And keep in mind: RAID 5 is *bad* for data writes. In my experience,
database data-write performance on RAID 5 UW SCSI is as slow as IDE
drives, particularly for updating large numbers of records, *unless* the
updated records are sequentially updated and clustered.

But in a multi-user write-often setup, RAID 5 will slow you down and
RAID 1 is better.

Did that help?

-Josh Berkus


From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no
Date: 2002-11-21 21:21:16
Message-ID: web-1836149@davinci.ethosmedia.com

Bjoern,

> Good to know.
>
> What do you think is faster: 3 drives in raid 1 or 3 drives in raid
> 5?

My experience? Raid 1. But that depends on other factors as well;
your controller (software controllers use system RAM and thus lower
performance), what kind of reads you're getting and how often. IMHO,
RAID 5 is faster for sequential reads (large numbers of records on
clustered indexes), RAID 1 for random reads.

And keep in mind: RAID 5 is *bad* for data writes. In my experience,
database data-write performance on RAID 5 UW SCSI is as slow as IDE
drives, particularly for updating large numbers of records, *unless* the
updated records are sequentially updated and clustered.

But in a multi-user write-often setup, RAID 5 will slow you down and
RAID 1 is better.

Did that help?

-Josh Berkus


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Bjoern Metzdorf <bm(at)turtle-entertainment(dot)de>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 21:24:00
Message-ID: Pine.LNX.4.33.0211211420280.23775-100000@css120.ihs.com

On Thu, 21 Nov 2002, Bjoern Metzdorf wrote:

> > In fact, the linux kernel supports >2 drives in a mirror. Useful for a
> > mostly read database that needs to handle lots of concurrent users.
>
> Good to know.
>
> What do you think is faster: 3 drives in raid 1 or 3 drives in raid 5?

Generally RAID 5. RAID 1 is only faster if you are doing a lot of
parallel reads. I.e. you have something like 10 agents reading at the
same time. RAID 5 also works better under parallel load than a single
drive.

The fastest of course, is multidrive RAID0. But there's no redundancy.

Oddly, my testing doesn't show any appreciable performance increase in
linux by layering RAID5 or 1 over RAID0 or vice versa, something that
is usually faster under most setups.


From: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 21:57:59
Message-ID: 016901c291a9$0f23cc20$0564a8c0@toolteam.net

> Generally RAID 5. RAID 1 is only faster if you are doing a lot of
> parellel reads. I.e. you have something like 10 agents reading at the
> same time. RAID 5 also works better under parallel load than a single
> drive.

yep, but write performance sucks.

> The fastest of course, is multidrive RAID0. But there's no redundancy.

With 4 drives I'd always go for raid 10, fast and secure

> Oddly, my testing doesn't show any appreciable performance increase in
> linux by layering RAID5 or 1 over RAID0 or vice versa, something that
> is usually faster under most setups.

Is this with Linux software RAID? RAID 10 is not significantly faster? Can't
believe that...

Regards,
Bjoern


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Bjoern Metzdorf <bm(at)turtle-entertainment(dot)de>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-21 22:37:47
Message-ID: Pine.LNX.4.33.0211211533390.23898-100000@css120.ihs.com

On Thu, 21 Nov 2002, Bjoern Metzdorf wrote:

> > Generally RAID 5. RAID 1 is only faster if you are doing a lot of
> > parellel reads. I.e. you have something like 10 agents reading at the
> > same time. RAID 5 also works better under parallel load than a single
> > drive.
>
> yep, but write performance sucks.

Well, it's not all that bad. After all, you only have to read the parity
stripe and data stripe (two reads), update the data stripe, xor the old and
new data stripes against the old parity stripe, and write both. In RAID 1 you
have to read the old data stripe, update it, and then write it to two
drives. So, generally speaking, it's not that much more work on RAID 5
than 1. My experience has been that RAID5 is only about 10 to 20 percent
slower than RAID1 in writing, if that.
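
The parity update can be sketched as below. This is the standard read-modify-write form, where the new parity is the old parity xor the old data xor the new data; the block contents and function names are made up for illustration, and a real controller works on whole stripes in hardware or kernel code.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise xor of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: cancel the old data out of the parity, add the new data."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


# Two data blocks plus one parity block, as on a 3-drive array.
d0, d1 = b"\x0f\xf0", b"\x33\x55"
parity = xor_blocks(d0, d1)

# Overwrite d0: two reads (old d0, old parity) and two writes (new d0, new parity).
new_d0 = b"\xaa\xbb"
parity = updated_parity(parity, d0, new_d0)

# The parity still lets us rebuild new_d0 from the surviving block.
print(xor_blocks(parity, d1) == new_d0)
```

Note that the untouched block d1 is never read during the update, which is why a small write costs four I/Os regardless of how many drives are in the array.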

> > The fastest of course, is multidrive RAID0. But there's no redundancy.
>
> With 4 drives I'd always go for raid 10, fast and secure
>
> > Oddly, my testing doesn't show any appreciable performance increase in
> > linux by layering RAID5 or 1 over RAID0 or vice versa, something that
> > is usually faster under most setups.
>
> Is this with linux software raid? raid10 is not significantly faster? cant
> believe that...

Yep, Linux software raid. It seems like it doesn't parallelize well.
That's with several different setups. I've tested it on a machine with a dual
Ultra 40/80 controller and 6 Ultra Wide 10krpm SCSI drives, and no matter
how I arrange the drives, 50, 10, 01, 05, the old 1 or 5 setups are just
about as fast.


From: Mario Weilguni <mweilguni(at)sime(dot)com>
To: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-22 07:31:11
Message-ID: 200211220831.11599.mweilguni@sime.com

On Thursday, 21 November 2002 21:53, Bjoern Metzdorf wrote:
> > In fact, the linux kernel supports >2 drives in a mirror. Useful for a
> > mostly read database that needs to handle lots of concurrent users.
>
> Good to know.
>
> What do you think is faster: 3 drives in raid 1 or 3 drives in raid 5?
>
> Regards,
> Bjoern
>

If 4 drives are an option, I suggest 2 x RAID1, one for data, and one for WAL and temporary DB space (pg_temp).

Regards,
Mario Weilguni


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mario Weilguni <mweilguni(at)sime(dot)com>
Cc: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-22 13:52:48
Message-ID: 23482.1037973168@sss.pgh.pa.us

Mario Weilguni <mweilguni(at)sime(dot)com> writes:
> If 4 drives are an option, I suggest 2 x RAID1, one for data, and one for WAL and temporary DB space (pg_temp).

Ideally there should be *nothing* on the WAL drive except WAL; you don't
ever want that disk head seeking away from the WAL. Put the temp files
on the data disk.
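
One common way to give the WAL its own disk is to move the pg_xlog directory there and leave a symlink behind. The sketch below shows that idea only; it assumes the server is stopped first, and the function name and any paths you pass in are hypothetical.

```python
import os
import shutil


def move_wal_dir(pgdata: str, wal_mount: str) -> str:
    """Move pgdata/pg_xlog onto another mount and symlink it back into place."""
    src = os.path.join(pgdata, "pg_xlog")
    dst = os.path.join(wal_mount, "pg_xlog")
    if os.path.islink(src):
        raise RuntimeError("pg_xlog is already a symlink")
    if os.path.exists(dst):
        raise RuntimeError("destination already exists")
    shutil.move(src, dst)  # relocate the WAL segments
    os.symlink(dst, src)   # the server keeps using the old path
    return dst
```

Nothing else should be placed on the destination disk, in keeping with the advice above.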

regards, tom lane


From: "philip johnson" <philip(dot)johnson(at)atempo(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-22 14:17:26
Message-ID: NDBBJLHHAKJFNNCGFBHLIEFPEFAA.philip.johnson@atempo.com
Lists: pgsql-admin pgsql-performance

pgsql-performance-owner(at)postgresql(dot)org wrote:
> Subject: Re: [PERFORM] [ADMIN] H/W RAID 5 on slower disks versus no
> raid on
>
>
> Mario Weilguni <mweilguni(at)sime(dot)com> writes:
>> If 4 drives are an option, I suggest 2 x RAID1, one for data, and
>> one for WAL and temporary DB space (pg_temp).
>
> Ideally there should be *nothing* on the WAL drive except WAL; you
> don't ever want that disk head seeking away from the WAL. Put the
> temp files on the data disk.
>
> regards, tom lane

Which temp files?


From: Andrew Sullivan <andrew(at)libertyrms(dot)info>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Date: 2002-11-22 15:01:52
Message-ID: 20021122100152.I27984@mail.libertyrms.com
Lists: pgsql-admin pgsql-performance

On Fri, Nov 22, 2002 at 08:52:48AM -0500, Tom Lane wrote:
> Mario Weilguni <mweilguni(at)sime(dot)com> writes:
> > If 4 drives are an option, I suggest 2 x RAID1, one for data, and one for WAL and temporary DB space (pg_temp).
>
> Ideally there should be *nothing* on the WAL drive except WAL; you don't
> ever want that disk head seeking away from the WAL. Put the temp files
> on the data disk.

Unless the interface and disks are so fast that it makes no
difference.

Try as I might, I can't make WAL go any faster on its own controller
and disks than if I leave it on the same filesystem as everything
else, on our production arrays. We use Sun A5200s, and I have tried
it set up with the WAL on separate disks on the box, and on separate
disks in the array, and even on separate disks on a separate
controller in the array (I've never tried it with two arrays, but I
don't have infinite money, either). I have never managed to
demonstrate a throughput difference outside the margin of error of my
tests. One arrangement -- putting the WAL on a separate pair of UFS
disks using RAID 1, but not on the fibre channel -- was demonstrably
slower than leaving the WAL in the data area.
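
One rough way to decide whether two layouts differ "outside the margin of error" is to compare the gap between their mean throughputs against the run-to-run spread. The sketch below is a rule of thumb rather than a proper significance test; the function name, the threshold k, and the numbers are all hypothetical and are not measurements from these tests:

```python
from statistics import mean, stdev

def outside_noise(runs_a, runs_b, k=2.0):
    """Return True if the means differ by more than k times the larger
    run-to-run standard deviation of the two layouts."""
    gap = abs(mean(runs_a) - mean(runs_b))
    noise = max(stdev(runs_a), stdev(runs_b))
    return gap > k * noise

# Hypothetical transactions-per-second figures, not real measurements.
wal_shared = [412.0, 398.0, 405.0, 420.0, 401.0]
wal_separate = [409.0, 415.0, 397.0, 411.0, 403.0]
print(outside_noise(wal_shared, wal_separate))  # False
```

With figures like these the two means are only 0.2 apart while individual runs vary by several units, so no difference can be claimed.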

Nothing is proved by this, of course, except that if you're going to
use high-performance hardware, you have to tune and test over and
over again. I was truly surprised that a separate pair of VxFS
RAID-1 disks in the array were no faster, but I guess it makes sense:
the array is just as busy in either case, and the disks are really
fast. I still don't really believe it, though.

A

--
----
Andrew Sullivan 204-4141 Yonge Street
Liberty RMS Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M2P 2A8
+1 416 646 3304 x110


From: "Merlin Moncure" <merlin(at)rcsonline(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-27 18:16:35
Message-ID: as3201$2bjg$1@news.hub.org
Lists: pgsql-admin pgsql-performance

On Mon, 2002-11-25 at 23:03, David Gilbert wrote:
>
> I'm on a bit of a mission to stamp out this misconception. In my
> testing, all but the most expensive hardware raid controllers are
> actually slower than FreeBSD's software RAID. I've done my tests with
> a variety of controllers with the same data load and the same disks.
>

I agree 100%: hardware raid sucks.
We had a RAID 5 Postgres server on midgrade hardware with five 60 GB 7200 rpm
IDE disks (240 GB total), and the throughput was about as high as a single
disk's (maybe 10% less). Same for the seek times. The CPU, around 1 GHz, never
hit more than 10% for the RAID service. Since very few databases are CPU
limited, this is a small price to pay.
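
The 240 GB figure follows from RAID 5 giving up one disk's worth of space to parity. A small sketch of the usual capacity arithmetic, ignoring metadata overhead (the function name usable_gb is made up for illustration):

```python
def usable_gb(level, disks, size_gb):
    """Usable capacity for a few common RAID levels, ignoring metadata overhead."""
    if level == 0:
        return disks * size_gb          # striping only, no redundancy
    if level == 1:
        return size_gb                  # every disk holds the same data
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * size_gb    # one disk's worth of parity
    if level == 10:
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks // 2 * size_gb     # striped mirror pairs
    raise ValueError("unsupported level")

print(usable_gb(5, 5, 60))   # 240, the five-disk array described above
print(usable_gb(5, 3, 18))   # 36, the RAID 5 option from the original question
print(usable_gb(10, 4, 60))  # 120
```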

We confirmed the performance results with heavy testing. There is virtually
no disadvantage to software RAID; just spend $10 and get a 10% faster CPU.

The Linux software RAID drivers (and others, I assume) are very optimized.
Not too sure about m$, but win2k comes with RAID services; it's pretty
reasonable to believe they work OK.

You can double the performance of a RAID system by going 0+x or x+0 (e.g. 05
or 50), just by adding drives. This really doubles it, and an optimized
software driver improves the seek times too by placing the idle heads in
different places on the disks.

P.S. SCSI is a huge waste of money and is no faster than IDE. IMHO, SCSI's
only advantage is longer cables. Both interfaces will soon be obsolete with
the coming Serial ATA. High-rpm disks are very expensive and add latent
heat to your system. Western Digital's IDE disks outperform even top-end
SCSI disks at a fraction of the cost. You can install a 4-drive RAID 10 setup
for the cost of a single 15k rpm SCSI disk that will absolutely blow it away
in terms of performance (reliability too).

Just remember, don't add more than one IDE disk on a RAID system to a single
IDE channel! Also, do not attempt to buy IDE cables longer than 18"; they
will not be reliable.

Merlin


From: David Jericho <david(dot)jericho(at)bytecomm(dot)com(dot)au>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: H/W RAID 5 on slower disks versus no raid on faster HDDs
Date: 2002-11-29 02:58:31
Message-ID: 20021129025831.GA12300@bytecomm.com.au
Lists: pgsql-admin pgsql-performance

On Wed, Nov 27, 2002 at 01:16:35PM -0500, Merlin Moncure wrote:
> I agree 100%: hardware raid sucks.

I've been mostly ignoring this thread, but I'm going to jump in at
this point.

> We confirmed the performance results with heavy testing. There is virtually
> no disadvantage to software RAID; just spend $10 and get a 10% faster CPU.

Define heavy testing.

I can do sequential selects on a low end PC with one client and have it
perform as well as an E10K. I could also fire up 600 clients doing
seemingly random queries and updates and reduce the same low end PC to
a smoldering pile of rubble.

It'd be easy to fudge the results of the "heavy testing" to match what I
wanted to believe.

> The linux software raid drivers (and others I assume) are very optimized.

As are the Solaris drivers, and many others. But there is more to a
RAID array than drivers. There's the stability of the controller
chipsets and the latency involved in getting data to and from the
devices.

Would you feel comfortable if you knew the state data for the aircraft
you're travelling on was stored on IDE software RAID?

Part of the point of hardware RAID is that it does only a small set
of operations, so it is far easier to specify and validate the
correct operation of the software and hardware.

> p.s. scsi is a huge waste of money and is no faster than IDE. IMHO, scsi's
> only advantage is longer cables. Both interfaces will soon be obsolete with
> the coming serial ATA.

Don't get me wrong, I'm a huge fan of IDE RAID in the right locations,
but comments like this reflect a total lack of understanding of what
makes SCSI a better protocol than IDE.

Disconnected operation is one _HUGE_ benefit of SCSI, simply being the
ability for the CPU and controller to send a command, and then both head
off to do another task while waiting for data to be returned from the
device. Most (that is most, not all) IDE controllers are incapable of
this. Another is command reordering (which I believe SATA does have),
being the reordering of requests to better utilise each head sweep.

This feature comes into play far more obviously when you have many
clients performing operations across a large dataset where the
elements have no immediate relationship to each other.

It is amplified when your database is of such a size, and used in such a
way, that you have multiple controllers with multiple spools.
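
To illustrate why reordering helps with unrelated requests, here is a simplified sketch that only counts head travel across block addresses. Real drives also account for rotational position and queue depth; the function names and the request queue are hypothetical:

```python
def head_travel(start, requests):
    """Total distance the head moves when serving requests in the given order."""
    travel, pos = 0, start
    for block in requests:
        travel += abs(block - pos)
        pos = block
    return travel

def one_sweep_order(start, requests):
    """Reorder requests elevator-style: serve everything at or above the head
    position in ascending order, then come back down for the rest."""
    up = sorted(b for b in requests if b >= start)
    down = sorted((b for b in requests if b < start), reverse=True)
    return up + down

# Hypothetical queue of block addresses from unrelated clients.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(head_travel(53, queue))                       # 640 in arrival order
print(head_travel(53, one_sweep_order(53, queue)))  # 299 after reordering
```

The same eight requests cost less than half the head movement once they are served in a single sweep, and the gap widens as more unrelated requests are queued.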

SCSI is not about speed to and from the device, although this does end
up being a side effect of the design. It's about latency, and removal of
contention from the shared bus.

Ultra/320 devices in reality are no faster than Ultra/160 devices.
What is faster, is that you can now have 4 devices instead of 2 on the
same bus, with lower request latency and no reduction in
throughput performance.
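
The arithmetic behind that can be sketched as follows. Ultra160 and Ultra320 offer 160 MB/s and 320 MB/s of bus bandwidth respectively; the 80 MB/s per-drive sustained rate is an assumption chosen to match the 2-versus-4 figure above, not a measured value:

```python
def drives_before_saturation(bus_mb_s, drive_mb_s):
    """How many drives streaming at drive_mb_s fit on a bus of bus_mb_s."""
    return int(bus_mb_s // drive_mb_s)

DRIVE_MB_S = 80  # assumed sustained rate per drive, for illustration only
print(drives_before_saturation(160, DRIVE_MB_S))  # 2
print(drives_before_saturation(320, DRIVE_MB_S))  # 4
```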

SCSI also allows some more advanced features too. Remote storage
over fibre, iSCSI, shared spools just to name a few.

> High rpm disks are very expensive and add latent heat to your system.

If you have a real justification for SCSI in your database server, you
probably do have both the cooling and the budget to accommodate it.

> Western Digital's IDE disks outperform even top-end SCSI disks at a
> fraction of the cost.

Complete and utter rubbish. That's like saying your 1.1 litre small
city commuter hatch can outperform a 600hp Mack truck.

Yes, in the general case it's quite probable. Once you start
shuffling real loads IDE will grind the machine to a halt. Real
database iron does not use normal IDE.

> You can install a 4-drive RAID 10 setup for the cost of a single 15k
> rpm SCSI disk that will absolutely blow it away in terms of performance

See above. Raw disk speed does not equal performance. Database spool
performance is a combination of a large number of factors, one being
seek time, and another being bus contention.

> (reliability too).

Now you're smoking crack. Having run some rather large server farms
for some very large companies, I can tell you with both anecdotal, and
recorded historical evidence that the failure rate for IDE was at
least double, if not four times that of the SCSI hardware.

And the IDE hardware was under much lower loads than the SCSI
hardware.

> Just remember, don't add more than one IDE disk on a raid system to a single
> IDE channel! Also, do not attempt to buy IDE cables longer than 18"..they
> will not be reliable.

So now you're pointing out that you share PCI bus interrupts over a large
number of devices, introducing another layer of contention and that
you'll have to cable your 20 spool machine with 20 cables each no longer
than 45cm. Throw in some very high transaction rates, and a large
data set that won't fit in your many GB of ram.

I believe the game show sound effect would be similar to "Bzzzt".

IDE for the general case is acceptable. SCSI is for everything else.

--
David Jericho
Senior Systems Administrator, Bytecomm Pty Ltd
