Re: FATAL 2: open of pg_clog error

Lists: pgsql-general
From: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>
To: <pgsql-general(at)postgresql(dot)org>
Subject: FATAL 2: open of pg_clog error
Date: 2002-03-05 12:02:29
Message-ID: 00dd01c1c43d$9fd26720$81c206d4@office.turtleentertainment.de

Hi,

We are using PostgreSQL 7.2 on a Linux 2.4.17 SMP kernel.

Since this morning we have been getting this error message while vacuuming:

2002-03-05 12:42:08 DEBUG: --Relation pg_toast_16854--
2002-03-05 12:42:10 FATAL 2: open of /raid/pgdata/pg_clog/0202 failed: No such file or directory
2002-03-05 12:42:10 DEBUG: server process (pid 14201) exited with exit code 2
2002-03-05 12:42:10 DEBUG: terminating any other active server processes

A quick search in the archives
(http://archives.postgresql.org/pgsql-bugs/2002-01/msg00066.php) showed that
Tom fixed a potential problem in src/backend/utils/time/tqual.c v 1.46, but
7.2 has v 1.49 already.

We are vacuuming several tables every 5 minutes. A vacuum analyze brings up
the same error.

Any hints besides doing an initdb?

Greetings,
Bjoern

PS: ls -la /raid/pgdata/pg_clog gives this:

drwx------ 2 postgres postgres 4096 Mar 4 23:48 .
drwx------ 6 postgres postgres 4096 Mar 5 12:18 ..
-rw------- 1 postgres postgres 262144 Feb 14 12:52 0006
-rw------- 1 postgres postgres 262144 Feb 14 17:57 0007
-rw------- 1 postgres postgres 262144 Feb 14 22:10 0008
-rw------- 1 postgres postgres 262144 Feb 15 11:57 0009
-rw------- 1 postgres postgres 262144 Feb 15 17:25 000A
-rw------- 1 postgres postgres 262144 Feb 15 22:33 000B
-rw------- 1 postgres postgres 262144 Feb 16 12:51 000C
-rw------- 1 postgres postgres 262144 Feb 16 17:40 000D
-rw------- 1 postgres postgres 262144 Feb 16 23:01 000E
-rw------- 1 postgres postgres 262144 Feb 17 13:13 000F
-rw------- 1 postgres postgres 262144 Feb 17 17:31 0010
-rw------- 1 postgres postgres 262144 Feb 17 21:27 0011
-rw------- 1 postgres postgres 262144 Feb 18 10:44 0012
-rw------- 1 postgres postgres 262144 Feb 18 16:44 0013
-rw------- 1 postgres postgres 262144 Feb 18 20:43 0014
-rw------- 1 postgres postgres 262144 Feb 19 06:50 0015
-rw------- 1 postgres postgres 262144 Feb 19 15:52 0016
-rw------- 1 postgres postgres 262144 Feb 19 19:59 0017
-rw------- 1 postgres postgres 262144 Feb 20 00:23 0018
-rw------- 1 postgres postgres 262144 Feb 20 14:33 0019
-rw------- 1 postgres postgres 262144 Feb 20 18:37 001A
-rw------- 1 postgres postgres 262144 Feb 20 22:33 001B
-rw------- 1 postgres postgres 262144 Feb 21 12:53 001C
-rw------- 1 postgres postgres 262144 Feb 21 17:16 001D
-rw------- 1 postgres postgres 262144 Feb 21 20:58 001E
-rw------- 1 postgres postgres 262144 Feb 22 05:36 001F
-rw------- 1 postgres postgres 262144 Feb 22 15:42 0020
-rw------- 1 postgres postgres 262144 Feb 22 19:40 0021
-rw------- 1 postgres postgres 262144 Feb 23 00:20 0022
-rw------- 1 postgres postgres 262144 Feb 23 13:28 0023
-rw------- 1 postgres postgres 262144 Feb 23 17:47 0024
-rw------- 1 postgres postgres 262144 Feb 23 23:35 0025
-rw------- 1 postgres postgres 262144 Feb 24 12:52 0026
-rw------- 1 postgres postgres 262144 Feb 24 16:45 0027
-rw------- 1 postgres postgres 262144 Feb 24 20:25 0028
-rw------- 1 postgres postgres 262144 Feb 25 00:20 0029
-rw------- 1 postgres postgres 262144 Feb 25 15:18 002A
-rw------- 1 postgres postgres 262144 Feb 25 19:02 002B
-rw------- 1 postgres postgres 262144 Feb 25 22:15 002C
-rw------- 1 postgres postgres 262144 Feb 26 11:35 002D
-rw------- 1 postgres postgres 262144 Feb 26 16:35 002E
-rw------- 1 postgres postgres 262144 Feb 26 19:45 002F
-rw------- 1 postgres postgres 262144 Feb 26 22:55 0030
-rw------- 1 postgres postgres 262144 Feb 27 13:40 0031
-rw------- 1 postgres postgres 262144 Feb 27 17:34 0032
-rw------- 1 postgres postgres 262144 Feb 27 20:55 0033
-rw------- 1 postgres postgres 262144 Feb 28 01:31 0034
-rw------- 1 postgres postgres 262144 Feb 28 15:33 0035
-rw------- 1 postgres postgres 262144 Feb 28 19:11 0036
-rw------- 1 postgres postgres 262144 Feb 28 22:36 0037
-rw------- 1 postgres postgres 262144 Mar 1 12:29 0038
-rw------- 1 postgres postgres 262144 Mar 1 16:59 0039
-rw------- 1 postgres postgres 262144 Mar 1 21:27 003A
-rw------- 1 postgres postgres 262144 Mar 2 09:23 003B
-rw------- 1 postgres postgres 262144 Mar 2 15:20 003C
-rw------- 1 postgres postgres 262144 Mar 2 20:03 003D
-rw------- 1 postgres postgres 262144 Mar 3 03:14 003E
-rw------- 1 postgres postgres 262144 Mar 3 14:39 003F
-rw------- 1 postgres postgres 262144 Mar 3 18:25 0040
-rw------- 1 postgres postgres 262144 Mar 3 21:56 0041
-rw------- 1 postgres postgres 262144 Mar 4 09:41 0042
-rw------- 1 postgres postgres 262144 Mar 4 16:22 0043
-rw------- 1 postgres postgres 262144 Mar 4 19:57 0044
-rw------- 1 postgres postgres 262144 Mar 4 23:48 0045
-rw------- 1 postgres postgres 155648 Mar 5 12:54 0046


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: FATAL 2: open of pg_clog error
Date: 2002-03-05 16:26:39
Message-ID: 15948.1015345599@sss.pgh.pa.us

"Bjoern Metzdorf" <bm(at)turtle-entertainment(dot)de> writes:
> since this morning we are getting this error message while vacuuming:

> 2002-03-05 12:42:08 DEBUG: --Relation pg_toast_16854--
> 2002-03-05 12:42:10 FATAL 2: open of /raid/pgdata/pg_clog/0202 failed: No such file or directory

Given that you don't have any actual clog segments beyond 0046, it would
seem that pg_toast_16854 contains a trashed tuple --- specifically, one
having a bogus xmin or xmax that's far beyond the existing range of
transaction IDs.
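
To make the size of that gap concrete, here is a rough sketch in Python. It
assumes the clog layout of 2 status bits per transaction (4 transactions per
byte), so that each full 262144-byte segment in the listing above covers
1,048,576 transaction IDs. The function names are illustrative only and are not
part of PostgreSQL.

```python
# Assumption: clog stores 2 commit-status bits per transaction,
# i.e. 4 transactions per byte.
XACTS_PER_BYTE = 4
SEGMENT_BYTES = 262144  # size of every full segment in the ls listing
XACTS_PER_SEGMENT = XACTS_PER_BYTE * SEGMENT_BYTES  # 1,048,576


def segment_xid_range(segment_name):
    """Return (first_xid, last_xid) covered by a clog segment file.

    segment_name is the hexadecimal file name, e.g. "0202".
    """
    seg = int(segment_name, 16)
    first = seg * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


def highest_xid_on_disk(last_segment_name, last_segment_bytes):
    """Upper bound on the XIDs that have clog space allocated on disk."""
    first, _ = segment_xid_range(last_segment_name)
    return first + last_segment_bytes * XACTS_PER_BYTE - 1


if __name__ == "__main__":
    # Segment the backend tried to open.
    print(segment_xid_range("0202"))             # (538968064, 540016639)
    # Newest segment actually present: 0046, 155648 bytes.
    print(highest_xid_on_disk("0046", 155648))   # 74022911
```

Under these assumptions the installation has used at most about 74 million
transaction IDs, while the tuple in question claims an ID of roughly 539
million, which is why the backend goes looking for a segment that was never
created.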

> Any hints besides doing an initdb?

You shouldn't need to initdb to get out of a problem with just one
table. I'd look in pg_class to see which table this is the toast table
for (look for reltoastrelid = (oid of pg_toast_16854)). Then see if
you can pg_dump that one table. If so, drop the table and reload from
the dump. If not, consider dropping the table anyway --- it beats
initdb for your whole database.
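
The steps above could be scripted roughly as follows. This is only a sketch
that builds the command strings for review rather than running anything; the
database name, table name, and dump file name are hypothetical placeholders,
and it assumes the usual `pg_dump -t` and `psql -c` / `psql -f` invocations.

```python
def find_owner_query(toast_relname):
    """SQL that looks up the table owning the given toast table."""
    return (
        "SELECT relname FROM pg_class "
        "WHERE reltoastrelid = "
        f"(SELECT oid FROM pg_class WHERE relname = '{toast_relname}');"
    )


def dump_and_reload_commands(table, dbname, dump_file):
    """Shell commands to dump one table, drop it, and reload it.

    Only proceed to the DROP if the pg_dump step completed without errors.
    """
    return [
        f"pg_dump -t {table} {dbname} > {dump_file}",
        f"psql -c 'DROP TABLE {table};' {dbname}",
        f"psql -f {dump_file} {dbname}",
    ]


if __name__ == "__main__":
    print(find_owner_query("pg_toast_16854"))
    # "suspect_table", "mydb" and "suspect_table.sql" are placeholders.
    for cmd in dump_and_reload_commands("suspect_table", "mydb", "suspect_table.sql"):
        print(cmd)
```

Printing the commands first, instead of executing them, leaves room to check
that the dump really succeeded before anything is dropped.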

Another interesting question is whether the problem stems from a
hardware fault (e.g., the disk dropped a few bytes) or a software bug (did
Postgres screw up?). Perhaps you could just rename the broken table out of the
way, instead of dropping it, so as to preserve it for future analysis.
I for one would be interested in looking at the broken data.

regards, tom lane