Re: production server down

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: production server down
Date:	2004-12-15 05:14:06
Message-ID:	41BFC81E.3050706@joeconway.com
Lists:	pgsql-hackers

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I've got a down production server (will not restart) with the following
>>tail to its log file:
>
> Please show the output of pg_controldata, or a hex dump of pg_control
> if pg_controldata fails.

OK, will do shortly.

>
>>The server experienced a hang (as yet unexplained) yesterday and was
>>restarted at 2004-12-13 16:38:49 according to syslog. I'm told by the
>>network admin that there was a problem with the network card on restart,
>>so the nfs mount most probably disappeared and then reappeared
>>underneath a quiescent postgresql at some point between 2004-12-13
>>16:39:55 and 2004-12-14 15:36:20 (but much closer to the former than the
>>latter).
>
> I've always felt that running a database across NFS was a Bad Idea ;-)

Yeah, I knew I had that coming :-)

>>Any help would be much appreciated. Is our only option pg_resetxlog?
>
> Possibly, but let's try to dig first. I suppose the DB is too large
> to save an image aside for forensics later?
>

Actually, although the database is about 400 GB, we do have room and are
in the process of saving an image now.

Joe

In response to

Re: production server down at 2004-12-15 04:42:03 from Tom Lane
