Re: Recommendations for SSDs in production?

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Kurt Buff <kurt(dot)buff(at)gmail(dot)com>
Cc: Benjamin Smith <lists(at)benjamindsmith(dot)com>, Robert Treat <rob(at)xzilla(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Recommendations for SSDs in production?
Date: 2011-11-04 14:26:32
Message-ID: 4EB3F618.30504@gmail.com
Lists: pgsql-general

On 2011-11-04 04:21, Kurt Buff wrote:
> Oddly enough, Tom's Hardware has a review of the Intel offering today
> - might be worth your while to take a look at it. Kurt

Thanks for that link! Seeing the media wearout comparisons between 'consumer
grade' and 'enterprise' disks was enough to stop me from considering the
Vertex 3 and Intel 510 behind hardware RAID: I'm going to stick with the
Intel 710 and Vertex 2 Pro on onboard SATA.

Tom's Hardware also showed how to estimate wearout using the workload
indicator, so I thought: let's do that with a pgbench workload.

First, if you're interested in doing a test like this yourself: I'm testing
on Ubuntu 11.10, but even though this is a brand-new distribution, the SMART
drive database was a few months old. Running 'update-smart-drivedb' had the
effect that the attribute names turned into something useful: instead of the
number of LBAs written, it now shows the number of 32MiB units written. There
are also three 'workload'-related attributes now.
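
For reference, a sketch of the two commands involved; the device name
/dev/sda is an assumption, adjust for your setup:

# refresh smartmontools' drive database so the Intel SSD attributes get proper names
sudo update-smart-drivedb

# dump all SMART attributes of the drive
sudo smartctl -A /dev/sda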

225 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 108551
226 Workld_Media_Wear_Indic 0x0032 100 100 000 Old_age  Always - 17
227 Workld_Host_Reads_Perc  0x0032 100 100 000 Old_age  Always - 0
228 Workload_Minutes        0x0032 100 100 000 Old_age  Always - 211
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age  Always - 0
241 Host_Writes_32MiB       0x0032 100 100 000 Old_age  Always - 108551
242 Host_Reads_32MiB        0x0032 100 100 000 Old_age  Always - 21510

Tom's Hardware shows how to turn these numbers into useful values on this page:
http://www.tomshardware.com/reviews/ssd-710-enterprise-x25-e,3038-4.html

The numbers above were taken 211 minutes after I cleared the workload values
with 'smartctl -t vendor,0x40 /dev/sda'. If you do that, the workload values
first become 0, then after a few minutes they all become 65535, and only
after about 60 minutes of testing do you see useful values returned again.
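
A minimal sketch of that reset-and-wait sequence (again assuming the drive
is /dev/sda):

# reset the workload-related counters (the vendor-specific command mentioned above)
sudo smartctl -t vendor,0x40 /dev/sda

# the workload attributes (226, 227, 228) only become meaningful after ~60 minutes
sleep 3600
sudo smartctl -A /dev/sda | grep -i -e Workld -e Workload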

During the test I did two one-hour pgbench runs on an md RAID1 of the
Intel 710 and Vertex 2 Pro, with the WAL in RAM (see the sketch below):
pgbench -i -s 300 t (fits in RAM)
pgbench -j 20 -c 20 -M prepared -T 3600 -l t (two times)
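
For completeness, one way to put the WAL in RAM for a throwaway benchmark
like this is a tmpfs mount with pg_xlog symlinked onto it. This is only a
sketch: the PGDATA path and tmpfs size are assumptions, and it is obviously
unsafe for any data you care about.

# stop the cluster, move pg_xlog onto a tmpfs mount, and symlink it back
# (run as the postgres user; adjust PGDATA and the tmpfs size for your setup)
pg_ctl -D "$PGDATA" stop
sudo mkdir -p /mnt/walram
sudo mount -t tmpfs -o size=2G tmpfs /mnt/walram
sudo chown postgres:postgres /mnt/walram
mv "$PGDATA"/pg_xlog /mnt/walram/pg_xlog
ln -s /mnt/walram/pg_xlog "$PGDATA"/pg_xlog
pg_ctl -D "$PGDATA" start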

The % media wear caused by the workload is Workld_Media_Wear_Indic / 1024:
17/1024 = .0166015625 %

Let's turn this into a number of days. I take the most pessimistic figure of
120 minutes of actual pgbench testing, rather than the 211 minutes total
since the workload reset.
120/(17/1024/100)/60/24 = 501.9608599031 days

The Host_Writes_32MiB value was 91099 before the test; now it is 108551.
(108551-91099)*32/1024 = 545 GB written during the test.

(108551-91099)*32/1024/1024/(17/1024/100) = 3208 TB before media wearout.
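
If you want to redo this arithmetic against your own SMART readings, a small
sketch with the values from this run hard-coded; substitute your own
attribute values:

# wear = Workld_Media_Wear_Indic / 1024 (percent), writes in 32MiB units before/after
awk 'BEGIN {
  wear_pct = 17 / 1024                             # % media wear caused by the workload
  minutes  = 120                                   # minutes of actual pgbench load
  writes   = (108551 - 91099) * 32 / 1024          # GB written during the test
  days     = minutes / (wear_pct / 100) / 60 / 24  # days until 100% wear at this rate
  tb_total = writes / 1024 / (wear_pct / 100)      # TB written before media wearout
  printf "%.1f GB written, %.0f days, %.0f TB before wearout\n", writes, days, tb_total
}'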

This number falls between Tom's Hardware's calculated wearout figures of
7268 TB for a sequential and 1437 TB for a random write load.

-- Yeb

PS: info on test setup
Model Number: INTEL SSDSA2BZ100G3 Firmware Revision: 6PB10362
Model Number: OCZ-VERTEX2 PRO Firmware Revision: 1.35

Partitions aligned on a 512kB boundary.
Workload on a ~20GB software RAID mirror (the drives are 100GB).

Linux client46 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux
PostgreSQL 9.2devel on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit

/proc/sys/vm/dirty_background_bytes set to 178500000
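
A sketch of applying that setting for the duration of a test (it does not
survive a reboot; putting vm.dirty_background_bytes in /etc/sysctl.conf
would make it persistent):

# lower the background dirty-page writeback threshold to ~178 MB
echo 178500000 | sudo tee /proc/sys/vm/dirty_background_bytes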

Non-standard PostgreSQL parameters:
maintenance_work_mem = 1GB # pgtune wizard 2011-10-28
checkpoint_completion_target = 0.9 # pgtune wizard 2011-10-28
effective_cache_size = 16GB # pgtune wizard 2011-10-28
work_mem = 80MB # pgtune wizard 2011-10-28
wal_buffers = 8MB # pgtune wizard 2011-10-28
checkpoint_segments = 96
shared_buffers = 5632MB # pgtune wizard 2011-10-28
max_connections = 300 # pgtune wizard 2011-10-28

Latency and tps graphs of *one* of the 20 clients during the second pgbench
test are here: http://imgur.com/a/jjl13 - note that the maximum latency has
dropped from ~3 seconds in earlier tests to ~1 second, mainly due to
increasing checkpoint_segments from 16 to 96.
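
For anyone who wants to reproduce such graphs: the per-transaction log files
written by 'pgbench -l' can be aggregated per second, e.g. with awk. A
sketch, assuming the log format of this pgbench version, where field 3 is
the transaction latency in microseconds and field 5 the epoch timestamp:

# per-second transaction count (tps) and average latency in ms from one log file
# (replace 12345 with the PID in the pgbench_log file name of your run)
awk '{ n[$5]++; lat[$5] += $3 }
     END { for (t in n) printf "%s %d %.2f\n", t, n[t], lat[t]/n[t]/1000 }' \
    pgbench_log.12345 | sort -n > tps_latency.txt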
