Re: How to recover when can't start database

From: "L(dot)Boldareva" <pg(at)pierro(dot)dds(dot)nl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: How to recover when can't start database
Date: 2005-04-01 15:57:55
Message-ID: Pine.LNX.4.58.0504011732040.19842@yafa.dds.nl
Lists: pgsql-admin

I have kind of fixed the problem (hopefully).

It turned out I used plan B, so dump/reload will be my next step.
What is meant by "corrupt data"? Is this about the data itself or the things
around it, like indexes and system tables?

Here is how I got it to crash:
I compiled a C procedure and copied the .so file into place at exactly the
time when (quite unfortunately) another query was running that used that
library.

I usually do pg_ctl reload right after that, and it seems to be enough,
but not this time.
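
Copying over the .so in place may be the risky step here, since a running backend can still have the old file mapped while its contents change. Below is a minimal sketch of a safer install, assuming a POSIX system. The helper name install_atomically is hypothetical and not part of PostgreSQL. It writes the new build to a temporary file next to the target and then rename()s it over the old name, so a process that already opened the old file keeps the old inode.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL API): install src_path as dst_path
 * without ever modifying the file currently stored at dst_path.
 * Returns 0 on success, -1 on failure.
 */
int install_atomically(const char *src_path, const char *dst_path)
{
    char tmp_path[4096];
    char buf[65536];
    ssize_t n;
    int in_fd, out_fd, len;

    assert(src_path != NULL && dst_path != NULL);

    /* The temp file lives next to the destination so that both are on the
     * same filesystem; rename() is only atomic within one filesystem. */
    len = snprintf(tmp_path, sizeof tmp_path, "%s.XXXXXX", dst_path);
    if (len < 0 || (size_t) len >= sizeof tmp_path)
        return -1;

    in_fd = open(src_path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    out_fd = mkstemp(tmp_path);     /* creates and opens a unique temp file */
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    /* Copy the new build into the temp file, handling short writes. */
    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t) (n - off));
            if (w < 0)
                goto fail;
            off += w;
        }
    }
    if (n < 0)
        goto fail;

    /* Make it readable/executable and flush it to disk before the swap. */
    if (fchmod(out_fd, 0755) != 0 || fsync(out_fd) != 0)
        goto fail;

    close(in_fd);
    if (close(out_fd) != 0) {
        unlink(tmp_path);
        return -1;
    }

    /* Swap the directory entry: the old inode stays intact for any process
     * that already has it open or mapped. */
    if (rename(tmp_path, dst_path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;

fail:
    close(in_fd);
    close(out_fd);
    unlink(tmp_path);
    return -1;
}
```

A call would look like install_atomically("myfunc.so.new", "/path/to/lib/myfunc.so"), where both paths are placeholders. Sessions that loaded the old library keep running the old code; only processes that open the file after the rename see the new build.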

The C function contained code that would send a query to populate a table
(most likely one of the 2 bad ones), but I am not sure this matters, since
2 tables were out of order and only one at a time is touched by my script.

That's it. I just checked, in case it matters, that fsync = true in the
config file.

On Fri, 1 Apr 2005, Tom Lane wrote:

> Hmm. AFAICS that could only happen if a page split record is pointing
> at an "original" page that's not there anymore; that is, the page is
> past what the kernel says is the end of the file.

Something like that was mentioned in the WARNING message when I tried to
drop the table, but I got that warning only once; further actions just
raised the error about relid.

Thank you for your help,
LB

> You could probably get it to start by changing the "false" to "true"
> in this call of XLogReadBuffer
>
> 	/* Left (original) sibling */
> 	buffer = XLogReadBuffer(false, reln, leftsib);
> 	if (!BufferIsValid(buffer))
> 		elog(PANIC, "btree_split_%s: lost left sibling", op);
>
> in src/backend/access/nbtree/nbtxlog.c (it's line 261 in CVS tip,
> possibly a little different in 8.0). Let us know if that helps.
>
> I'd be a bit suspicious of the contents of the index, if not the
> whole database, so an immediate dump,reinitdb,reload might be your
> most prudent course of action after you get it to start.
>
> Plan B would be to wipe out the WAL log with pg_resetxlog. This will
> allow you to start but the odds of having corrupt data afterwards would
> be about 100% ... you *must* dump and reload if you go that way.
>
> regards, tom lane
>
