Database Recovery Procedures

From: Network Administrator <netadmin(at)vcsn(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Database Recovery Procedures
Date: 2003-09-16 19:29:36
Message-ID: 1063740576.3f6764a06040f@webmail.vcsn.com
Lists: pgsql-general

It looks like, for the first time in 6 years, I'm experiencing some database
table corruption. This was due to the space filling up on a server (you don't
want to know how that happened).

I have 3 corrupt tables; the others are fine (I dumped them to be safe). I
have a backup which I could use, but then I realized there might be some
"surgery" I could perform to get the tables "repaired". Note that the
normal recovery that the database does on its own did not work in this case.

I looked through the documentation (Admin 7.3.2) and I thought there was a
disaster recovery section, but "recovery" is only discussed as part of
backup/restore. If this information exists somewhere else, a link would be a
great help as well.

My point is that I think this is something that is important to have, at
least in regard to the different strategies one could try to surgically
recover data BEFORE using the broadsword method of going to a backup. One of
the successful "sell" points I use with my clients is how resilient Linux/Unix
filesystems are, as well as Pg on Linux. In this case, though, I don't have
FS corruption, so I'd like to know what I should and could do.

Suggestions?

Oh and here is the output of a "select *" on one of the corrupt tables...

(saved as draft email here on 9/12/03)

..Ok, I was going to paste that in the email but now the database isn't coming
up at all. Here is the startup message:

~~~
DEBUG: FindExec: found "/usr/local/pgsql/bin/postgres" using argv[0]
DEBUG: invoking IpcMemoryCreate(size=1466368)
DEBUG: FindExec: found "/usr/local/pgsql/bin/postmaster" using argv[0]
LOG: database system shutdown was interrupted at 2003-09-16 15:11:36 EDT
LOG: checkpoint record is at 5/2D497FC0
LOG: redo record is at 5/2D497FC0; undo record is at 0/0; shutdown TRUE
LOG: next transaction id: 5287090; next oid: 26471
LOG: database system was not properly shut down; automatic recovery in progress
LOG: ReadRecord: unexpected pageaddr 5/27498000 in log file 5, segment 45, offset 4816896
LOG: redo is not required
PANIC: XLogWrite: write request 5/2D498000 is past end of log 5/2D498000
DEBUG: reaping dead processes
LOG: startup process (pid 17031) was terminated by signal 6
LOG: aborting startup due to startup process failure
DEBUG: proc_exit(1)
DEBUG: shmem_exit(1)
DEBUG: exit(1)
~~~
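For anyone trying to read the WAL addresses in that output, here is a minimal sketch of how a location such as 5/2D498000 maps onto the "log file 5, segment 45, offset 4816896" wording in the ReadRecord line. It assumes the default 16 MB WAL segment size; the function name `decode_xlog_location` is made up for illustration and is not part of PostgreSQL.

```python
# Assumption: 16 MB WAL segments (the default build setting).
XLOG_SEG_SIZE = 16 * 1024 * 1024


def decode_xlog_location(location):
    """Split a 'logid/offset' string (both parts hex) into
    (log file number, segment number, byte offset within the segment)."""
    log_id_hex, rec_off_hex = location.split("/")
    log_id = int(log_id_hex, 16)
    rec_off = int(rec_off_hex, 16)
    # The high bits of the offset select the segment file,
    # the remainder is the byte position inside that file.
    segment, offset = divmod(rec_off, XLOG_SEG_SIZE)
    return log_id, segment, offset


if __name__ == "__main__":
    # Addresses copied from the startup log above.
    for label, loc in [
        ("checkpoint record", "5/2D497FC0"),
        ("address in PANIC line", "5/2D498000"),
        ("pageaddr found on disk", "5/27498000"),
    ]:
        log_id, segment, offset = decode_xlog_location(loc)
        print(f"{label}: log file {log_id}, segment {segment}, offset {offset}")
```

Under that assumption, the address in the PANIC line decodes to exactly the page ReadRecord complained about (log file 5, segment 45, offset 4816896), while the page address actually found there decodes to segment 39 at the same offset. That may simply mean the page still holds data from an older, reused segment file, but I can't confirm that from the log alone.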

Thanks in advance to all

--
Keith C. Perry
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com


