Re: 8x2.5" or 6x3.5" disks

Lists: pgsql-performance
From: "Christian Nicolaisen" <blackbrrd(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: 8x2.5" or 6x3.5" disks
Date: 2008-01-28 19:25:59
Message-ID: 285fb20801281125k641902feo8d74b1c68640d94b@mail.gmail.com

Hi

We are looking to buy a new server and I am wondering what kind of hardware
we should buy and how to configure it.

We are getting either 8x 2.5" 15k rpm SAS disks or 6x 3.5" 15k rpm SAS
disks.
If we go with the 2.5" disks, I think we should run 6 disks in RAID 10 for
the database and 2 disks in RAID 1 for OS/WAL.
If we go with the 3.5" disks, I think we should run 4 disks in RAID 10 for
the database and 2 disks in RAID 1 for OS/WAL.
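
A rough sketch of what the two proposed data arrays look like in terms of mirrored pairs and usable space. The disk sizes (73 GB and 146 GB) are assumptions for illustration only, since the post does not state capacities, and `raid10_layout` is a hypothetical helper, not part of any RAID tool.

```python
def raid10_layout(n_disks: int, disk_gb: float) -> dict:
    """Return the number of mirrored pairs and usable capacity of a RAID 10 array."""
    if n_disks < 4 or n_disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    pairs = n_disks // 2  # each pair is a RAID 1 mirror; pairs are striped
    return {"pairs": pairs, "usable_gb": pairs * disk_gb}

# Disk sizes below are hypothetical; the thread does not give them.
small_form_factor = raid10_layout(n_disks=6, disk_gb=73.0)   # 6 x 2.5"
large_form_factor = raid10_layout(n_disks=4, disk_gb=146.0)  # 4 x 3.5"
print(small_form_factor)
print(large_form_factor)
```

Under these assumed sizes both layouts hold the 60 GB of data and indexes with room to spare; the difference is three striped mirror pairs against two.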

The database currently has about 40 GB of data and 20 GB of indexes. The
work consists mainly of small transactions, with the occasional report.
Currently we have 10 GB of RAM, which is enough to hold the working set in
memory 99% of the time.
The database server will only be running Postgres, nothing else.

So, my question is: should I go for the 2.5" disk setup or 3.5" disk setup,
and does the raid setup in either case look correct?


From: Arjen van der Meijden <acmmailing(at)tweakers(dot)net>
To: Christian Nicolaisen <blackbrrd(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-28 20:28:21
Message-ID: 479E3AE5.20600@tweakers.net

On 28-1-2008 20:25 Christian Nicolaisen wrote:
> So, my question is: should I go for the 2.5" disk setup or 3.5" disk
> setup, and does the raid setup in either case look correct?

Afaik they are about equal in speed. With the smaller ones being a bit
faster in random access and the larger ones a bit faster for sequential
reads/writes.

My guess is that the 8x 2.5" configuration will be faster than the 6x
3.5" one. Even if the 3.5" drives happen to be faster, they probably
aren't 50% faster. So since you don't need the larger storage capacity
that 3.5" drives offer, I'd go with the 8x 2.5" setup.
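
One way to read the 50% figure is as a break-even point: how much faster each disk in the smaller array would have to be to match the larger array. The sketch below assumes throughput scales linearly with the number of disks, which is a simplification, and `break_even_speedup` is a hypothetical helper.

```python
def break_even_speedup(many_disks: int, few_disks: int) -> float:
    """Per-disk speedup the smaller array needs to match the larger one.

    Assumes aggregate throughput scales linearly with disk count.
    """
    if few_disks <= 0 or many_disks < few_disks:
        raise ValueError("need 0 < few_disks <= many_disks")
    return many_disks / few_disks

data_array = break_even_speedup(6, 4)  # RAID 10 data arrays: 6 vs 4 disks
whole_box = break_even_speedup(8, 6)   # all disks in the chassis: 8 vs 6
print(f"data array: {data_array:.2f}x, whole box: {whole_box:.2f}x")
```

For the data arrays proposed in the original post (6 disks against 4) the break-even is 1.5x, i.e. the 3.5" drives would each need to be 50% faster.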

Best regards,

Arjen


From: "Claus Guttesen" <kometen(at)gmail(dot)com>
To: david(at)lang(dot)hm
Cc: "Arjen van der Meijden" <acmmailing(at)tweakers(dot)net>, "Christian Nicolaisen" <blackbrrd(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 08:06:49
Message-ID: b41c75520801290006y64b003fah6defc71f8fb05ac9@mail.gmail.com

> I missed the initial post in this thread, but I haven't seen any 15K rpm
> 2.5" drives, so if you compare 10K rpm 2.5" drives with 15K rpm 3.5"
> drives you will see differences (depending on your workload and controller
> cache)

I have some 15K rpm 2.5" sas-drives from HP. Other vendors have them as well.

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentlest gamester is the soonest winner.

Shakespeare


From: david(at)lang(dot)hm
To: Arjen van der Meijden <acmmailing(at)tweakers(dot)net>
Cc: Christian Nicolaisen <blackbrrd(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 08:32:25
Message-ID: alpine.DEB.1.00.0801290030390.16707@asgard.lang.hm

On Mon, 28 Jan 2008, Arjen van der Meijden wrote:

> On 28-1-2008 20:25 Christian Nicolaisen wrote:
>> So, my question is: should I go for the 2.5" disk setup or 3.5" disk setup,
>> and does the raid setup in either case look correct?
>
> Afaik they are about equal in speed. With the smaller ones being a bit faster
> in random access and the larger ones a bit faster for sequential
> reads/writes.

I missed the initial post in this thread, but I haven't seen any 15K rpm
2.5" drives, so if you compare 10K rpm 2.5" drives with 15K rpm 3.5"
drives you will see differences (depending on your workload and controller
cache)
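
One part of the 10K versus 15K rpm difference can be worked out directly: the average rotational latency is the time for half a revolution. A minimal sketch follows (the function name is made up for illustration); it ignores seek time, transfer time and controller cache, which also matter.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait for the target sector: half of one revolution, in ms."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 0.5 * 60_000.0 / rpm

for rpm in (10_000, 15_000):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.1f} ms")
```

This gives 3.0 ms at 10K rpm and 2.0 ms at 15K rpm, so the faster spindle saves about a millisecond per random access on average.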

David Lang


From: Arjen van der Meijden <acmmailing(at)tweakers(dot)net>
To: david(at)lang(dot)hm
Cc: Christian Nicolaisen <blackbrrd(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 10:29:23
Message-ID: 479F0003.6080003@tweakers.net

There are several suppliers who offer Seagate's 2.5" 15k rpm disks; I
know HP and Dell are amongst those. So I was actually referring to those,
rather than to the 10k ones.

Best regards,

Arjen

david(at)lang(dot)hm wrote:
> On Mon, 28 Jan 2008, Arjen van der Meijden wrote:
>
>> On 28-1-2008 20:25 Christian Nicolaisen wrote:
>>> So, my question is: should I go for the 2.5" disk setup or 3.5" disk
>>> setup, and does the raid setup in either case look correct?
>>
>> Afaik they are about equal in speed. With the smaller ones being a bit
>> faster in random access and the larger ones a bit faster for
>> sequential reads/writes.
>
> I missed the initial post in this thread, but I haven't seen any 15K rpm
> 2.5" drives, so if you compare 10K rpm 2.5" drives with 15K rpm 3.5"
> drives you will see differences (depending on your workload and
> controller cache)
>
> David Lang


From: "Mike Smith" <mike(dot)smith(at)enterprisedb(dot)com>
To: "Christian Nicolaisen" <blackbrrd(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 11:43:15
Message-ID: 51494DB187D98F4C88DBEBF1F5F6D42303EC56C9@edb06.mail01.enterprisedb.com

You don't mention the capacity of the disks you are looking at. Here is
something you might want to consider.

I've seen a few performance posts on using different hardware
technologies to gain improvements. Most of those comments are on raid,
interface and rotation speed. One area that doesn't seem to have been
mentioned is to run your disks empty.

One of the key roadblocks in disk performance is the time for the disk
heads to seek, settle and find the start of the data. Another is the
time to transfer from disk to interface. Everyone may instinctively
know this but it's often ignored.

Hard disks are CRV (constant rotational velocity) = they spin at the
same speed all the time.

Hard disk drives use a technology called ZBR = Zone Bit Recording = a
lot more data on the outside tracks than the inner ones.

Hard disks fill up from outside track to inside track generally unless
you've done some weird partitioning.

On the outside of the disk you get a lot more data per seek than on the
inside. Double whammy: you get it faster.

Performance can vary more than 100% between the outer and inner tracks
of the disk. So running a slower disk twice as big may give you more
benefit than running a small capacity 15K disk full. The slower disks
are also generally more reliable and mostly much cheaper.

The other issue for full disks, especially with lots of random small
transactions, is that the heads are seeking and settling across the
whole disk, typically with most of those seeks going to the latest
transactions, which are placed nicely towards the middle of the disk.

I know of a major bank that has a rule of thumb of 25% of the disk
partitioned as a target maximum for high performance disk systems in a
key application. They also only pay for used capacity from their disk
vendor.

This is not very green as you need to buy more disks for the same amount
of data and it's liable to upset your purchasing department who won't
understand why you don't want to fill your disks up.
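
To put a number on "more disks for the same amount of data", here is a small sketch that counts disks for a given fill limit. The 60 GB figure is the data plus indexes from the original post; the 73 GB disk size is an assumption, and the calculation ignores RAID mirroring, which would double the count.

```python
import math

def disks_needed(data_gb: float, disk_gb: float, max_fill: float) -> int:
    """Disks required to hold data_gb when each disk is filled to max_fill."""
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    return math.ceil(data_gb / (disk_gb * max_fill))

data_gb = 60.0   # 40 GB data + 20 GB indexes, from the original post
disk_gb = 73.0   # hypothetical disk size
print(disks_needed(data_gb, disk_gb, max_fill=1.0))   # disks filled completely
print(disks_needed(data_gb, disk_gb, max_fill=0.25))  # 25% rule of thumb
```

Under these assumptions the 25% rule turns one disk into four before any redundancy is added.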

Mike


From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "Mike Smith" <mike(dot)smith(at)enterprisedb(dot)com>
Cc: "Christian Nicolaisen" <blackbrrd(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 15:00:22
Message-ID: dcc563d10801290700l5138d13fvb9fa05b66ddc32f2@mail.gmail.com

On Jan 29, 2008 5:43 AM, Mike Smith <mike(dot)smith(at)enterprisedb(dot)com> wrote:
>
> You don't mention the capacity of the disks you are looking at. Here is
> something you might want to consider.
>
> I've seen a few performance posts on using different hardware technologies
> to gain improvements. Most of those comments are on raid, interface and
> rotation speed. One area that doesn't seem to have been mentioned is to
> run your disks empty.
>
> One of the key roadblocks in disk performance is the time for the disk heads
> to seek, settle and find the start of the data. Another is the time to
> transfer from disk to interface. Everyone may instinctively know this but
> it's often ignored.
>
> Hard disks are CRV (constant rotational velocity) = they spin at the same
> speed all the time.
>
> Hard disk drives use a technology called ZBR = Zone Bit Recording = a lot
> more data on the outside tracks than the inner ones.
>
> Hard disks fill up from outside track to inside track generally unless
> you've done some weird partitioning.

This really depends on your file system. While NTFS does this, ext2/3
certainly does not. Many unix file systems use a more random method
to distribute their writes.

The rest of what you describe is called "short stroking" in most
circles. It's certainly worth looking into no matter what size drives
you're using.


From: Craig James <craig_james(at)emolecules(dot)com>
To: Mike Smith <mike(dot)smith(at)enterprisedb(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 15:06:23
Message-ID: 479F40EF.6070802@emolecules.com

Mike Smith wrote:
> I've seen a few performance posts on using different hardware
> technologies to gain improvements. Most of those comments are on raid,
> interface and rotation speed. One area that doesn't seem to have
> been mentioned is to run your disks empty.
> ...
> On the outside of the disk you get a lot more data per seek than on the
> inside. Double whammy: you get it faster.
>
> Performance can vary more than 100% between the outer and inner tracks
> of the disk. So running a slower disk twice as big may give you more
> benefit than running a small capacity 15K disk full. The slower disks
> are also generally more reliable and mostly much cheaper.
> ...
> This is not very green as you need to buy more disks for the same amount
> of data and it's liable to upset your purchasing department who won't
> understand why you don't want to fill your disks up.

So presumably the empty-disk effect could also be achieved by partitioning, say 25% of the drive for the database, and 75% empty partition. But in fact, you could use that "low performance 75%" for rarely-used or static data, such as the output from pg_dump, that is written during non-peak times.
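
The split described above can be sketched as a simple size calculation. It assumes the first partition ends up on the outer tracks (low block addresses usually map there, but that depends on the drive), and the 438 GB array size is hypothetical (three mirrored pairs of assumed 146 GB disks).

```python
def short_stroke_plan(array_gb: float, fast_fraction: float = 0.25) -> dict:
    """Split an array into a 'fast' first partition and a 'slow' remainder."""
    if not 0 < fast_fraction < 1:
        raise ValueError("fast_fraction must be between 0 and 1")
    fast_gb = array_gb * fast_fraction
    return {"fast_gb": fast_gb, "slow_gb": array_gb - fast_gb}

plan = short_stroke_plan(array_gb=438.0)  # hypothetical array size
database_gb = 60.0                        # data + indexes from the original post
print(plan)
print("database fits in fast partition:", database_gb <= plan["fast_gb"])
```

With these assumed numbers the fast quarter is about 110 GB, enough for the 60 GB database, and the remaining three quarters would be left for dumps and other rarely-used data.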

Pretty cool.

Craig


From: "Mike Smith" <mike(dot)smith(at)enterprisedb(dot)com>
To: "Craig James" <craig_james(at)emolecules(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 8x2.5" or 6x3.5" disks
Date: 2008-01-29 18:40:25
Message-ID: 51494DB187D98F4C88DBEBF1F5F6D4230257D9A6@edb06.mail01.enterprisedb.com

[presumably the empty-disk effect could also be achieved by partitioning, say 25% of the drive for the database, and 75% empty partition. But in fact, you could use that "low performance 75%" for rarely-used or static data, such as the output from pg_dump, that is written during non-peak times]

Larry Ellison financed a company called Pillar Data Systems, which was founded on the principle that you can tier the disk according to the value and performance requirements of the data. They planned to put the data that is most valuable in performance terms on the outside of SATA disks and use the empty space in the middle for slower stuff.
(This is not an advert. I like the idea but I don't know if it works well, and I don't have anything to do with Pillar other than that EnterpriseDB competes against Larry's other little company.)
Probably the way to go is flash drives for primary performance data. EMC and others have announced Enterprise Flash Drives (they claim 30 times the performance of 15K disks, although at 30 times the cost of standard disk today). Flash should also have pretty much consistent high performance across the whole capacity.
Within a couple of years EFD should be affordable for mainstream use.
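
Taking the vendor claim above at face value (30 times the performance at 30 times the cost), the cost per unit of performance works out the same as for a 15K disk. A tiny sketch of that arithmetic, using a hypothetical helper; the post does not say whether the cost multiple is per drive or per GB, so this says nothing about cost per unit of capacity.

```python
def cost_per_unit_performance(perf_multiple: float, cost_multiple: float) -> float:
    """Cost per unit of performance relative to the baseline disk (1.0 = same)."""
    if perf_multiple <= 0:
        raise ValueError("perf_multiple must be positive")
    return cost_multiple / perf_multiple

# Multiples as claimed in the post; they are vendor figures, not measurements.
efd = cost_per_unit_performance(perf_multiple=30.0, cost_multiple=30.0)
print(f"relative cost per unit of performance: {efd:.1f}")
```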