database corruption questions

Lists: pgsql-general
From: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: database corruption questions
Date: 2012-10-13 21:53:03
Message-ID: CAMuqQjf9jDSV8aX-uOZKVD=GKoZD+bmPUg8-xqLTefw52dZCUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi

Are there any best practices for avoiding database
corruption? I suppose the most obvious one is
to have a ups if it's a desktop machine.
How do you detect corruption in a Postgresql
database and are there any ways to fix it besides
restoring a backup?

Thanks

H.F.


From: Leif Biberg Kristensen <leif(at)solumslekt(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-13 22:04:02
Message-ID: 201210140004.02891.leif@solumslekt.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Lørdag 13. oktober 2012 23.53.03 skrev Heine Ferreira :
> Hi
>
> Are there any best practices for avoiding database
> corruption?

In my experience, database corruption always comes down to flaky disk drives.
Keep your disks new and shiny eg. less than 3 years, and go for some kind of
redundancy in a RAID configuration.

regards, Leif


From: John R Pierce <pierce(at)hogranch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-13 22:54:12
Message-ID: 5079F114.2020504@hogranch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/13/12 3:04 PM, Leif Biberg Kristensen wrote:
> Lørdag 13. oktober 2012 23.53.03 skrev Heine Ferreira :
>> >Hi
>> >
>> >Are there any best practices for avoiding database
>> >corruption?
> In my experience, database corruption always comes down to flaky disk drives.
> Keep your disks new and shiny eg. less than 3 years, and go for some kind of
> redundancy in a RAID configuration.

also, ECC RAM so creeping bit rot doesn't slip in from memory without
detection. if you use a raid controller with a write-back cache, be
sure it has BBU or flash-back.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-14 02:13:55
Message-ID: 507A1FE3.9080305@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/14/2012 05:53 AM, Heine Ferreira wrote:
> Hi
>
> Are there any best practices for avoiding database
> corruption?

* Maintain rolling backups with proper ageing. For example, keep one a
day for the last 7 days, then one a week for the last 4 weeks, then one
a month for the rest of the year, then one a year.

* Use warm standby with log shipping and/or replication to maintain a
live copy of the DB.

* If you want point-in-time recovery, keep a few days or weeks worth of
WAL archives and a basebackup around. That'll help you recover from
those "oops I meant DROP TABLE unimportant; not DROP TABLE
vital_financial_records;" issues.

* Keep up to date with the latest PostgreSQL patch releases. Don't be
one of those people still running 9.0.0 when 9.0.10 is out.

* Plug-pull test your system when you're testing it before going live.
Put it under load with something like pgbench, then literally pull the
plug out. If your database doesn't come back up fine you have hardware,
OS or configuration problems.

* Don't `kill -9` the postmaster. It should be fine, but it's still not
smart.

* ABSOLUTELY NEVER DELETE postmaster.pid

* Use good quality hardware with proper cooling and a good quality power
supply. If possible, ECC RAM is a nice extra.

* Never, ever, ever use cheap SSDs. Use good quality hard drives or
(after proper testing) high end SSDs. Read the SSD reviews periodically
posted on this mailing list if considering using SSDs. Make sure the SSD
has a supercapacitor or other reliable option for flushing its write
cache on power loss. Always do repeated plug-pull testing when using SSDs.

* Use a solid, reliable file system. zfs-on-linux, btrfs, etc are not
the right choices for a database you care about. Never, ever, ever use
FAT32.

* If on Windows, do not run an anti-virus program on your
database server. Nobody should be using it for other things or running
programs on it anyway.

* Avoid RAID 5, mostly because the performance is terrible, but also
because I've seen corruption issues with rebuilds from parity on failing
disks.

* Use a good quality hardware RAID controller with a battery backup
cache unit if you're using spinning disks in RAID. This is as much for
performance as reliability; a BBU will make an immense difference to
database performance.

* If you're going to have a UPS (you shouldn't need one as your system
should be crash-safe), don't waste your money on a cheap one. Get a good
online double-conversion unit that does proper power filtering. Cheap
UPSs are just a battery with a fast switch, they provide no power
filtering and what little surge protection they offer is done with a
component that wears out after absorbing a few surges, becoming totally
ineffective. Since your system should be crash-safe a cheap UPS will do
nothing for corruption protection, it'll only help with uptime.

--
Craig Ringer


From: John R Pierce <pierce(at)hogranch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-14 03:00:36
Message-ID: 507A2AD4.9060002@hogranch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/13/12 7:13 PM, Craig Ringer wrote:
>
> * Use a good quality hardware RAID controller with a battery backup
> cache unit if you're using spinning disks in RAID. This is as much for
> performance as reliability; a BBU will make an immense difference to
> database performance.

a comment on this one.... I have some test servers with lots of SAS
and/or SATA drives on controllers like LSI Logic 9261-8i, with 512MB or
1GB battery-backed cache. I can configure the controller for JBOD
and use linux mdraid raid10 and get the same performance as the
controllers native raid10, as long as the write-back cache is
enabled. disable the writeback cache, and you might as well be using
SATA JBOD.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-14 03:26:40
Message-ID: 507A30F0.3040908@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/14/2012 11:00 AM, John R Pierce wrote:
> On 10/13/12 7:13 PM, Craig Ringer wrote:
>>
>> * Use a good quality hardware RAID controller with a battery backup
>> cache unit if you're using spinning disks in RAID. This is as much for
>> performance as reliability; a BBU will make an immense difference to
>> database performance.
>
> a comment on this one.... I have some test servers with lots of SAS
> and/or SATA drives on controllers like LSI Logic 9261-8i, with 512MB or
> 1GB battery-backed cache. I can configure the controller for JBOD
> and use linux mdraid raid10 and get the same performance as the
> controllers native raid10, as long as the write-back cache is
> enabled. disable the writeback cache, and you might as well be using
> SATA JBOD.

Yeah, without the write-back cache you don't gain much. I run a couple
of DBs on plain old `md` RAID and I'm actually quite happy with it.

I've expanded this into a blog post and improved that section there.

http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Comments appreciated.

--
Craig Ringer


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-14 03:28:02
Message-ID: 507A3142.2070407@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/14/2012 05:53 AM, Heine Ferreira wrote:
> Hi
>
> Are there any best practices for avoiding database
> corruption?

I forgot to mention, you should also read:

http://www.postgresql.org/docs/current/static/wal-reliability.html

--
Craig Ringer


From: Chris Angelico <rosuav(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-14 04:02:06
Message-ID: CAPTjJmoF_iakBwq+maDQs+3N97OjMzUqJ3=r74f5YYZG9m2V_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Sun, Oct 14, 2012 at 1:13 PM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:
> * Never, ever, ever use cheap SSDs. Use good quality hard drives or (after
> proper testing) high end SSDs. Read the SSD reviews periodically posted on
> this mailing list if considering using SSDs. Make sure the SSD has a
> supercapacitor or other reliable option for flushing its write cache on
> power loss. Always do repeated plug-pull testing when using SSDs.

Interesting. My boss just bought a set of SSDs for some test systems,
with the intention of using them for our next deployment. They're
giving really great performance under pgbench, but we haven't yet done
a plug-pull test on any of them. I'll make sure I do that next week.

Is there an article somewhere about how best to do a plug-pull test?
Or is it as simple as "fire up pgbench, kill the power, bring things
back up, and see if anything isn't working"?

ChrisA


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Chris Angelico <rosuav(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-14 04:29:31
Message-ID: 507A3FAB.1080402@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/14/2012 12:02 PM, Chris Angelico wrote:
>
> Is there an article somewhere about how best to do a plug-pull test?
> Or is it as simple as "fire up pgbench, kill the power, bring things
> back up, and see if anything isn't working"?

That's what I'd do and what I've always done in the past, but others
here are much more experienced with testing gear into production.

You can also use pg_test_fsync and diskchecker.pl . See:

http://www.postgresql.org/docs/current/static/wal-reliability.html

I do repeated plug-pull tests and make sure fsync is being honoured.

--
Craig Ringer


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: John R Pierce <pierce(at)hogranch(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-16 00:19:28
Message-ID: 20121016001928.GB7480@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Sun, Oct 14, 2012 at 11:26:40AM +0800, Craig Ringer wrote:
> On 10/14/2012 11:00 AM, John R Pierce wrote:
> >On 10/13/12 7:13 PM, Craig Ringer wrote:
> >>
> >>* Use a good quality hardware RAID controller with a battery backup
> >>cache unit if you're using spinning disks in RAID. This is as much for
> >>performance as reliability; a BBU will make an immense difference to
> >>database performance.
> >
> >a comment on this one.... I have some test servers with lots of SAS
> >and/or SATA drives on controllers like LSI Logic 9261-8i, with 512MB or
> >1GB battery-backed cache. I can configure the controller for JBOD
> >and use linux mdraid raid10 and get the same performance as the
> >controllers native raid10, as long as the write-back cache is
> >enabled. disable the writeback cache, and you might as well be using
> >SATA JBOD.
>
> Yeah, without the write-back cache you don't gain much. I run a
> couple of DBs on plain old `md` RAID and I'm actually quite happy
> with it.
>
> I've expanded this into a blog post and improved that section there.
>
> http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Craig, that is a great post. Can you get it on Planet Postgres?

http://planet.postgresql.org/

I think you would have to subscribe your RSS blog feed.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Daniel Serodio <dserodio(at)mandic(dot)com(dot)br>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-17 17:06:31
Message-ID: 507EE597.5060801@mandic.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Craig Ringer wrote:
> On 10/14/2012 05:53 AM, Heine Ferreira wrote:
>> Hi
>>
>> Are there any best practices for avoiding database
>> corruption?
>
> * Maintain rolling backups with proper ageing. For example, keep one a
> day for the last 7 days, then one a week for the last 4 weeks, then
> one a month for the rest of the year, then one a year.
What kind of "rolling backups"? From pg_basebackup?
>
> * Use warm standby with log shipping and/or replication to maintain a
> live copy of the DB.
>
> * If you want point-in-time recovery, keep a few days or weeks worth
> of WAL archives and a basebackup around. That'll help you recover from
> those "oops I meant DROP TABLE unimportant; not DROP TABLE
> vital_financial_records;" issues.
>
> * Keep up to date with the latest PostgreSQL patch releases. Don't be
> one of those people still running 9.0.0 when 9.0.10 is out.
The problem is that updating the database usually results in downtime.
Or can the downtime be avoided in a replication scenario?
>
> * Plug-pull test your system when you're testing it before going live.
> Put it under load with something like pgbench, then literally pull the
> plug out. If your database doesn't come back up fine you have
> hardware, OS or configuration problems.
>
> * Don't `kill -9` the postmaster. It should be fine, but it's still
> not smart.
>
> * ABSOLUTELY NEVER DELETE postmaster.pid
>
> * Use good quality hardware with proper cooling and a good quality
> power supply. If possible, ECC RAM is a nice extra.
>
> * Never, ever, ever use cheap SSDs. Use good quality hard drives or
> (after proper testing) high end SSDs. Read the SSD reviews
> periodically posted on this mailing list if considering using SSDs.
> Make sure the SSD has a supercapacitor or other reliable option for
> flushing its write cache on power loss. Always do repeated plug-pull
> testing when using SSDs.
>
> * Use a solid, reliable file system. zfs-on-linux, btrfs, etc are not
> the right choices for a database you care about. Never, ever, ever use
> FAT32.
>
> * If on Windows, do not run an anti-virus program on your
> database server. Nobody should be using it for other things or running
> programs on it anyway.
>
> * Avoid RAID 5, mostly because the performance is terrible, but also
> because I've seen corruption issues with rebuilds from parity on
> failing disks.
>
> * Use a good quality hardware RAID controller with a battery backup
> cache unit if you're using spinning disks in RAID. This is as much for
> performance as reliability; a BBU will make an immense difference to
> database performance.
>
> * If you're going to have a UPS (you shouldn't need one as your system
> should be crash-safe), don't waste your money on a cheap one. Get a
> good online double-conversion unit that does proper power filtering.
> Cheap UPSs are just a battery with a fast switch, they provide no
> power filtering and what little surge protection they offer is done
> with a component that wears out after absorbing a few surges, becoming
> totally ineffective. Since your system should be crash-safe a cheap
> UPS will do nothing for corruption protection, it'll only help with
> uptime.
>
> --
> Craig Ringer
>
Thanks,
Daniel Serodio


From: "Daniel Serodio (lists)" <daniel(dot)lists(at)mandic(dot)com(dot)br>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-17 17:09:38
Message-ID: 507EE652.4000101@mandic.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Craig Ringer wrote:
> On 10/14/2012 05:53 AM, Heine Ferreira wrote:
>> Hi
>>
>> Are there any best practices for avoiding database
>> corruption?
>
> * Maintain rolling backups with proper ageing. For example, keep one a
> day for the last 7 days, then one a week for the last 4 weeks, then
> one a month for the rest of the year, then one a year.
What kind of "rolling backups"? From pg_basebackup?
>
> * Use warm standby with log shipping and/or replication to maintain a
> live copy of the DB.
>
> * If you want point-in-time recovery, keep a few days or weeks worth
> of WAL archives and a basebackup around. That'll help you recover from
> those "oops I meant DROP TABLE unimportant; not DROP TABLE
> vital_financial_records;" issues.
>
> * Keep up to date with the latest PostgreSQL patch releases. Don't be
> one of those people still running 9.0.0 when 9.0.10 is out.
The problem is that updating the database usually results in downtime.
Or can the downtime be avoided in a replication scenario?
>
> * Plug-pull test your system when you're testing it before going live.
> Put it under load with something like pgbench, then literally pull the
> plug out. If your database doesn't come back up fine you have
> hardware, OS or configuration problems.
>
> * Don't `kill -9` the postmaster. It should be fine, but it's still
> not smart.
>
> * ABSOLUTELY NEVER DELETE postmaster.pid
>
> * Use good quality hardware with proper cooling and a good quality
> power supply. If possible, ECC RAM is a nice extra.
>
> * Never, ever, ever use cheap SSDs. Use good quality hard drives or
> (after proper testing) high end SSDs. Read the SSD reviews
> periodically posted on this mailing list if considering using SSDs.
> Make sure the SSD has a supercapacitor or other reliable option for
> flushing its write cache on power loss. Always do repeated plug-pull
> testing when using SSDs.
>
> * Use a solid, reliable file system. zfs-on-linux, btrfs, etc are not
> the right choices for a database you care about. Never, ever, ever use
> FAT32.
>
> * If on Windows, do not run an anti-virus program on your
> database server. Nobody should be using it for other things or running
> programs on it anyway.
>
> * Avoid RAID 5, mostly because the performance is terrible, but also
> because I've seen corruption issues with rebuilds from parity on
> failing disks.
>
> * Use a good quality hardware RAID controller with a battery backup
> cache unit if you're using spinning disks in RAID. This is as much for
> performance as reliability; a BBU will make an immense difference to
> database performance.
>
> * If you're going to have a UPS (you shouldn't need one as your system
> should be crash-safe), don't waste your money on a cheap one. Get a
> good online double-conversion unit that does proper power filtering.
> Cheap UPSs are just a battery with a fast switch, they provide no
> power filtering and what little surge protection they offer is done
> with a component that wears out after absorbing a few surges, becoming
> totally ineffective. Since your system should be crash-safe a cheap
> UPS will do nothing for corruption protection, it'll only help with
> uptime.
>
> --
> Craig Ringer
>
Thanks,
Daniel Serodio


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Daniel Serodio <dserodio(at)mandic(dot)com(dot)br>
Cc: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-18 00:32:31
Message-ID: 507F4E1F.5000105@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 10/18/2012 01:06 AM, Daniel Serodio wrote:
> Craig Ringer wrote:
>> On 10/14/2012 05:53 AM, Heine Ferreira wrote:
>>> Hi
>>>
>>> Are there any best practices for avoiding database
>>> corruption?
>>
>> * Maintain rolling backups with proper ageing. For example, keep one a
>> day for the last 7 days, then one a week for the last 4 weeks, then
>> one a month for the rest of the year, then one a year.
> What kind of "rolling backups"? From pg_basebackup?

I'd recommend good old `pg_dump`, that way you're not assuming that your
cluster's on-disk format is intact and happy. Regular dumps will also
help detect any damage that might've crept in from file system
corruption, HDD/RAID faults, etc. Not that that should happen, but we're
talking preventative action here.

I elaborated somewhat here:


http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Alternately, check out barman: http://www.pgbarman.org/ . I haven't
tried it yet, but it looks very promising. I'd still want to take
periodic dumps, as I'm reluctant to rely on `pg_basebackup` style
cluster copies alone.

>> * Keep up to date with the latest PostgreSQL patch releases. Don't be one of those people still running 9.0.0 when 9.0.10 is out.
> The problem is that updating the database usually results in downtime. Or can the downtime be avoided in a replication scenario?

Patches within the same minor release require extremely minimal downtime.

- Stop server
- Install new binaries
- Start server

How much downtime will a data corruption bug (yes, they've happened)
that's fixed in a new version cost you if you don't patch and it bites
you? Or a bug that causes a server crash and restart? Plan downtime, so
you don't have potentially much longer unplanned downtime at the worst
possible time.

You can do zero-downtime minor updates using hot standby and standby
promotion; see http://www.repmgr.org/ .

Updating to a new major release is a bigger job, but that's not what I'm
talking about.

BTW, please trim your replies to quote just the relevant context.

--
Craig Ringer