Re: emergency outage requiring database restart

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: emergency outage requiring database restart
Date: 2016-10-21 18:37:22
Message-ID: CAHyXU0z_6T-SGQeaiBV9VDNSROhyouqO8Ox0=WGCKKT9+-q+cA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 21, 2016 at 8:03 AM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:
> On Tue, Oct 18, 2016 at 8:45 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
>> Most or all the damage seemed to be to the system catalogs with
>> at least two critical tables dropped or inaccessible in some
>> fashion. A lot of the OIDs seemed to be pointing at the wrong
>> thing.
>
> While the oid in pg_class often matches the filename, that is not
> true after some operations (like CLUSTER or VACUUM FULL). It is
> the relfilenode column that is the definitive link to the file.

no such operations happened. In the first instance at least one table
dropped from the system catalogs. I have a hunch that the heap is
fine (supported by the size of the database on disk). At this
precise moment I'm restoring the database to another fileserver in
order to do some forensic analysis, also in the hopes of getting the
second database online in order to expedite recovery.

ah -- done. :-) deleting the init file didn't help, but starting up
single user allowed the start up to gracefully fail with a FATAL cache
lookup.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2016-10-21 18:58:10 Re: emergency outage requiring database restart
Previous Message Stephen Frost 2016-10-21 18:03:27 Re: Renaming of pg_xlog and pg_clog