Re: production server down

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: production server down
Date: 2004-12-15 03:42:50
Message-ID: 200412150342.iBF3got12800@candle.pha.pa.us
Lists: pgsql-hackers

Joe Conway wrote:
> This is a SuSE 9, 8-way Xeon IBM x445, with nfs mounted Network
> Appliance for database storage, postgresql-7.4.5-36.4.
>
> The server experienced a hang (as yet unexplained) yesterday and was
> restarted at 2004-12-13 16:38:49 according to syslog. I'm told by the
> network admin that there was a problem with the network card on restart,
> so the nfs mount most probably disappeared and then reappeared
> underneath a quiescent postgresql at some point between 2004-12-13
> 16:39:55 and 2004-12-14 15:36:20 (but much closer to the former than the
> latter).

Well, my first reaction is that if the file system storage was not
always 100% reliable, then there is no way to know the data is correct
except by restoring from backup. The startup failure indicates that
there were surely storage problems in the past. There is no way to know
how far that corruption goes.

You can use pg_resetxlog to reset the transaction log so the server will
start, and then look to see how accurate the data is, but there is no
way to be sure. Before doing that, I would back up the file system with
the server down in case you want to do some more serious recovery
attempts later.
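
A minimal sketch of that cold backup step, in Python. The function name
cold_backup and both paths are hypothetical, and it assumes the only
sign of a running server is a postmaster.pid file in the data
directory. It copies the files as they are and makes no attempt to
validate or repair them.

```python
import os
import tarfile


def cold_backup(data_dir, archive_path):
    """Archive data_dir as a gzip'd tarball, refusing if the server looks up.

    Hypothetical helper: data_dir is the PostgreSQL data directory and
    archive_path should live on different, trusted storage.
    """
    pid_file = os.path.join(data_dir, "postmaster.pid")
    # The postmaster writes this file at startup. A stale copy can survive
    # a crash, so confirm no server process exists before removing it by hand.
    if os.path.exists(pid_file):
        raise RuntimeError("postmaster.pid present; stop the server first")

    # Store everything under the directory's own name so a restore
    # recreates a single top-level directory.
    top = os.path.basename(os.path.normpath(data_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(data_dir, arcname=top)
    return archive_path
```

Only after the archive is safely stored elsewhere would I run
pg_resetxlog against the original data directory.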

The Freenode IRC channel can probably walk you through more details of
the recovery process.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
