Re: Cleaning up unreferenced table files

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Cleaning up unreferenced table files
Date: 2005-05-06 00:55:24
Message-ID: 8291.1115340924@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Applied.

Now that I've had a chance to look at it, this patch is thoroughly
broken. Problems observed in a quick review:

1. It doesn't work at all for non-default tablespaces: it will
claim that every file in such a tablespace is stale. The fact
that it does that rather than failing entirely is accidental.
It tries to read the database's pg_class in the target tablespace
whether it's there or not. Because the system is still in recovery
mode, the low-level routines allow the access to the nonexistent
pg_class table to pass --- in fact they think they should create
the file, so after it runs there's a bogus empty "1259" file in each
such tablespace (which of course it complains about, too). The code
then proceeds to think that pg_class is empty so of course everything
draws a warning.

2. It's not robust against stale subdirectories of a tablespace
(ie, subdirs corresponding to a nonexistent database) --- again,
it'll try to read a nonexistent pg_class. Then it'll produce a
bunch of off-target complaint messages.

3. It's assuming that relfilenode is unique database-wide, when no
such assumption is safe. We only have a guarantee that it's unique
tablespace-wide.

4. It fails to examine table segment files (such as "nnn.1"). These
should be complained of when the "nnn" doesn't match any hash entry.

5. It will load every relfilenode value in pg_class into the hashtable
whether it's meaningful or not. There should be a check on relkind.

6. I don't think relying on strtol to decide if a filename is entirely
numeric is very safe. Note all the extra defenses in pg_atoi against
various platform-specific misbehaviors of strtol. Personally I'd use a
strspn test instead.

7. There are no checks for readdir failure (compare any other readdir
loop in the backend).

See also Simon Riggs' complaints that the circumstances under which it's
done are pretty randomly selected. (One particular thing that I think
is a bad idea is to do this in a standalone backend. Any sort of
corruption in any db's pg_class would render it impossible to start up.)

To fix the first three problems, and also avoid the performance problem
of multiply rescanning a database's pg_class for each of its
tablespaces, I would suggest that the hashtable entries be widened to
RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid). Then
there should be one iteration over pg_database to learn the OIDs and
default tablespaces of each database; with that you can read pg_class
from its correct location for each database and load all the entries
into the hashtable. Then you iterate through the tablespaces looking
for stuff not present in the hashtable. You might also want to build a
list or hashtable of known database OIDs, so that you can recognize a
stale subdirectory immediately and issue a direct complaint about it
without even recursing into it.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Kings-Lynne 2005-05-06 02:00:33 Re: Views, views, views! (long)
Previous Message elein 2005-05-06 00:48:30 Re: Views, views, views! (long)

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2005-05-06 03:43:00 Re: COPY CSV header line feature
Previous Message Bruce Momjian 2005-05-05 21:03:11 Re: [pgsql-hackers-win32] snprintf causes regression tests