Re: Orphaned files in base/[oid]

From: Andres Freund <andres(at)anarazel(dot)de>
To: Chris Travers <chris(dot)travers(at)adjust(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Orphaned files in base/[oid]
Date: 2017-08-16 17:15:51
Message-ID: 20170816171551.moi5hzgomtmvql6q@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-08-16 14:20:02 +0200, Chris Travers wrote:
> So having throught about this a bit more, and having had some real-world
> experience with the script now, I have an idea that might work and some
> questions to make it succeed.
>
> My thinking is to add a new form of vacuum called VACUUM FSCK.
>
> This would:
> 1. lock pg_class in exclusive mode (or do I need exclusive access?), as
> this is needed to solve the race conditions. As I see, this seems to bring
> the database to a screeching halt concurrency-wise (but unlike my script
> would allow other databases to be accessed normally).
> 2. read the files where the name consists of only digits out of the
> filesystem and compare with oids from pg_class and relfilenodes
> 3. Any file not found in that list would then unlink it, as well as any
> files with the patter followed by an underscore or period.
>
> This would mean that the following cases would not be handled:
>
> If you have the first extent gone but later extents are present we check on
> the first extant, and so would not see the later ones. Same goes for
> visibility maps and other helper files.
>
> If you add a file in the directory which has a name like 34F3A222BC, that
> would never get cleaned up because it contains non-digits.
>
> So this leads to the following questions:
>
> 1. Is locking pg_class enough to avoid race conditions? Is exclusive mode
> sufficient or do I need exclusive access mode?
> 2. would it be preferable to move the file to a directory rather than
> unlinking it?
> 3. Should I perform any sort of check on the tables at the end to make
> sure everything is ok?

I think this entirely is the wrong approach. We shouldn't add weird
check commands that require locks on pg_class, we should avoid leaving
the orphaned files in the first place. I've upthread outlined
approached how to do so.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-08-16 17:16:50 Re: Hash Functions
Previous Message Andres Freund 2017-08-16 17:11:15 Re: Atomics for heap_parallelscan_nextpage()