Re: production server down

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: production server down
Date: 2004-12-15 05:14:06
Message-ID: 41BFC81E.3050706@joeconway.com
Lists: pgsql-hackers

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I've got a down production server (will not restart) with the following
>>tail to its log file:
>
> Please show the output of pg_controldata, or a hex dump of pg_control
> if pg_controldata fails.

OK, will do shortly.

>
>>The server experienced a hang (as yet unexplained) yesterday and was
>>restarted at 2004-12-13 16:38:49 according to syslog. I'm told by the
>>network admin that there was a problem with the network card on restart,
>>so the NFS mount most probably disappeared and then reappeared
>>underneath a quiescent PostgreSQL at some point between 2004-12-13
>>16:39:55 and 2004-12-14 15:36:20 (but much closer to the former than the
>>latter).
>
> I've always felt that running a database across NFS was a Bad Idea ;-)

Yeah, I knew I had that coming :-)

>>Any help would be much appreciated. Is our only option pg_resetxlog?
>
> Possibly, but let's try to dig first. I suppose the DB is too large
> to save an image aside for forensics later?
>

Actually, although the database is about 400 GB, we do have room and are
in the process of saving an image now.

Joe
