Re: Very bizarre bug with corrupted index

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org, mail(at)joeconway(dot)com
Subject: Re: Very bizarre bug with corrupted index
Date: 2003-05-03 18:44:44
Message-ID: 200305031144.44029.josh@agliodbs.com
Lists: pgsql-bugs

Tom,

> Did you shut down the postmaster (or at least force a checkpoint)
> before examining the busted index? It's barely possible that the zeroes
> on disk didn't correspond to what was in buffer cache. How big was this
> index, anyway?

The index is pretty small, actually (< 20 indexed rows in the table), although
it does suffer severe attenuation between VACUUMs due to numerous updates (up
to 100,000 discarded rows in some periods between the 5-minute VACUUMs).

However, I did find out that they did *not* shut down the postmaster before
copying the file, so the contents are not reliable. Unfortunately, the
original installation is long gone, so we'll have to wait until it happens
again (or, more likely, doesn't) to analyze it.
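
For next time, a minimal sketch of how a copied index file could be checked
for zeroed pages. It assumes the copy was taken after a clean shutdown or a
checkpoint, and that the server uses the default 8192-byte block size; the
function name count_zero_pages is made up for illustration and is not part of
any Postgres tool.

```python
import sys

# Assumption: default PostgreSQL block size. A custom build may differ.
PAGE_SIZE = 8192


def count_zero_pages(path, page_size=PAGE_SIZE):
    """Return (total_pages, zero_pages) for the file at `path`.

    A page counts as "zero" when every byte in it is 0x00. A short
    trailing read is still counted as one page.
    """
    total = 0
    zero = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(page_size)
            if not page:
                break
            total += 1
            # any() is False only when all bytes in the page are zero
            if not any(page):
                zero += 1
    return total, zero


if __name__ == "__main__" and len(sys.argv) > 1:
    total, zero = count_zero_pages(sys.argv[1])
    print(f"{zero} of {total} pages are all zeroes")
```

Run against a copy of the file rather than the live data directory. If every
page comes back zeroed, that would be consistent with the whole file having
been zeroed out, but it says nothing about what was in the buffer cache at the
time.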

Fortunately or not, it may not happen again; they've been running this version
of the database on 3-5 overloaded test systems for the last month+ and this is
the first such error.

> But I'm inclined to blame the filesystem --- I can't think of any
> plausible mechanism in Postgres that would zero out all of a file.
> What filesystem are you using?

Ext3, as it turns out.

--
Josh Berkus
Aglio Database Solutions
San Francisco
