Ignore lost+found when checking if a directory is empty

Lists: pgsql-hackers
From: Brian Pitts <bdp(at)uga(dot)edu>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 18:52:25
Message-ID: 4E4181E9.2010902@uga.edu

When an ext2, ext3, or ext4 filesystem is mounted directly on the PGDATA directory, initdb will refuse to run because it sees the
lost+found directory that mke2fs created and assumes the PGDATA directory is already in use for something other than PostgreSQL.
Attached is a patch against master which will cause a directory that contains only lost+found to still be treated as empty.

This was previously proposed in 2001; see http://archives.postgresql.org/pgsql-hackers/2001-03/msg01194.php

--
Brian Pitts
Systems Administrator | EuPathDB Bioinformatics Resource Center
706-542-1447 | bdp(at)uga(dot)edu | http://eupathdb.org

Attachment: Ignore-lost-found-when-checking-if-a-directory-is-empty.patch (Content-Type: text/x-patch, Size: 0 bytes)
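
The archive lists the attachment as 0 bytes, so the patch text is not shown here. As a rough illustration of the behavior described above, and not the actual patch, here is a minimal sketch of a directory check that treats a directory containing only lost+found as empty. The function name dir_is_effectively_empty and its return convention are hypothetical and are not taken from the PostgreSQL source.

```c
#include <dirent.h>
#include <string.h>

/*
 * Hypothetical helper, not the code from the patch.
 *
 * Returns  1 if the directory is empty or contains only lost+found,
 *          0 if it contains any other entry,
 *         -1 if the directory cannot be opened.
 */
static int
dir_is_effectively_empty(const char *path)
{
    DIR           *dir = opendir(path);
    struct dirent *entry;
    int            result = 1;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL)
    {
        /* Skip the entries every directory has, plus lost+found. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0 ||
            strcmp(entry->d_name, "lost+found") == 0)
            continue;

        /* Anything else means the directory is in use. */
        result = 0;
        break;
    }

    closedir(dir);
    return result;
}
```

A caller would proceed with initialization only when the function returns 1. Note that the sketch does not verify that lost+found is itself a directory or that it is empty.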

From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Brian Pitts <bdp(at)uga(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 20:03:24
Message-ID: CAJKUy5hjLoQRfWLJ2tT=K4pNyL+T6ajb-Hg29gtRnyfb9LYPww@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 9, 2011 at 1:52 PM, Brian Pitts <bdp(at)uga(dot)edu> wrote:
> When an ext2, ext3, or ext4 filesystem is mounted directly on the PGDATA directory, initdb will refuse to run because it sees the
> lost+found directory that mke2fs created and assumes the PGDATA directory is already in use for something other than PostgreSQL.
> Attached is a patch against master which will cause a directory that contains only lost+found to still be treated as empty.
>
> This was previously proposed in 2001; see http://archives.postgresql.org/pgsql-hackers/2001-03/msg01194.php
>

I have wanted that before, and the patch is very simple... Peter had a
concern about that though, still a concern?
"""
Initdb or the database system can do
anything they want in that directory, so it's not good to save lost blocks
somewhere in the middle, even if chances are low you need them. I say,
create a subdirectory.
"""

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Brian Pitts <bdp(at)uga(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 20:03:26
Message-ID: 1312920206.20479.7.camel@jdavis-ux.asterdata.local
Lists: pgsql-hackers

On Tue, 2011-08-09 at 14:52 -0400, Brian Pitts wrote:
> When an ext2, ext3, or ext4 filesystem is mounted directly on the
> PGDATA directory, initdb will refuse to run because it sees the
> lost+found directory that mke2fs created and assumes the PGDATA
> directory is already in use for something other than PostgreSQL.
> Attached is a patch against master which will cause a directory that
> contains only lost+found to still be treated as empty.
>
> This was previously proposed in 2001; see
> http://archives.postgresql.org/pgsql-hackers/2001-03/msg01194.php

In the referenced discussion (10 years ago), Tom seemed OK with it and
Peter did not seem to like it much.

I think I agree with Peter here that it's not a very good idea, and I
don't see a big upside. With tablespaces it seems to make a little bit
more sense, but I'd still lean away from that idea.

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Brian Pitts <bdp(at)uga(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 20:08:03
Message-ID: 1312920483.20479.9.camel@jdavis-ux.asterdata.local
Lists: pgsql-hackers

On Tue, 2011-08-09 at 14:52 -0400, Brian Pitts wrote:
> Attached is a patch against master which will cause a directory that
> contains only lost+found to still be treated as empty.

Please add this to the September commitfest at:
https://commitfest.postgresql.org/

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brian Pitts <bdp(at)uga(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 20:28:29
Message-ID: 12168.1312921709@sss.pgh.pa.us
Lists: pgsql-hackers

Brian Pitts <bdp(at)uga(dot)edu> writes:
> When an ext2, ext3, or ext4 filesystem is mounted directly on the PGDATA directory, initdb will refuse to run because it sees the
> lost+found directory that mke2fs created and assumes the PGDATA directory is already in use for something other than PostgreSQL.
> Attached is a patch against master which will cause a directory that contains only lost+found to still be treated as empty.

This has been proposed before, and rejected before, on the grounds that
you shouldn't be using a mount-point directory as a data directory
anyway. Better practice is to make a postgres-owned directory just
underneath the mount point. A couple of reasons for that are:

1. Mount-point directories should be owned by root, never by an
unprivileged account such as postgres. IIRC there are good security
reasons for this practice, though I don't recall all the details right
now.

2. Keeping the data directory one level down ensures a clean failure if
the disk is for some reason not mounted when Postgres starts, or goes
offline later. Otherwise, particularly if you're using a start script
that will automatically try an initdb, you might end up with some data
files on the / volume underneath where the mount point should have been.
This is sure to lead to serious problems when the disk does come back
online. There's at least one horror story in our archives from someone
who had an auto-initdb startup script and one day his NFS disk was a few
seconds slow to mount...
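
A start script can make the second point explicit by refusing to run when the data directory is missing or is itself a mount point. The following is a minimal sketch under stated assumptions, not anything PostgreSQL ships: the function names is_mount_point and data_dir_looks_safe are hypothetical, and the device-number comparison is a common heuristic that will not detect every case (for example, a bind mount from the same filesystem).

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if path looks like a mount point,
 * 0 if it does not, and -1 if it cannot be examined.
 *
 * A directory is treated as a mount point when it sits on a different
 * device than its parent, or when it is its own parent (the root
 * directory).
 */
static int
is_mount_point(const char *path)
{
    char        parent[4096];
    struct stat st;
    struct stat pst;
    int         n;

    n = snprintf(parent, sizeof(parent), "%s/..", path);
    if (n < 0 || (size_t) n >= sizeof(parent))
        return -1;

    if (stat(path, &st) != 0 || stat(parent, &pst) != 0)
        return -1;
    if (!S_ISDIR(st.st_mode))
        return -1;

    return (st.st_dev != pst.st_dev || st.st_ino == pst.st_ino) ? 1 : 0;
}

/*
 * Hypothetical guard for a start script: returns 1 only when datadir
 * exists, is a directory, and is not itself a mount point.
 */
static int
data_dir_looks_safe(const char *datadir)
{
    return is_mount_point(datadir) == 0;
}
```

With the data directory one level below the mount point, an unmounted disk means the directory does not exist at all, so the guard returns 0 and the script can stop instead of running initdb on the / volume.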

> This was previously proposed in 2001; see http://archives.postgresql.org/pgsql-hackers/2001-03/msg01194.php

It's been discussed more recently than that, I believe.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Brian Pitts <bdp(at)uga(dot)edu>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 20:36:30
Message-ID: 1312922055-sup-9409@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Jeff Davis's message of Tue Aug 09 16:03:26 -0400 2011:
> On Tue, 2011-08-09 at 14:52 -0400, Brian Pitts wrote:
> > When an ext2, ext3, or ext4 filesystem is mounted directly on the
> > PGDATA directory, initdb will refuse to run because it sees the
> > lost+found directory that mke2fs created and assumes the PGDATA
> > directory is already in use for something other than PostgreSQL.
> > Attached is a patch against master which will cause a directory that
> > contains only lost+found to still be treated as empty.
> >
> > This was previously proposed in 2001; see
> > http://archives.postgresql.org/pgsql-hackers/2001-03/msg01194.php
>
> In the referenced discussion (10 years ago), Tom seemed OK with it and
> Peter did not seem to like it much.
>
> I think I agree with Peter here that it's not a very good idea, and I
> don't see a big upside. With tablespaces it seems to make a little bit
> more sense, but I'd still lean away from that idea.

What if the init script tries to start postmaster before the filesystems
are mounted? ISTM requiring a subdir is a good sanity check that the
system is ready to run. Not creating stuff directly on the mountpoint
ensures consistency.

If you don't think this is a likely problem, search for Joe Conway's
report about an NFS share being unmounted for a while when postmaster was
started up, a couple of years ago. Yes, it's rare. Yes, it's real.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Brian Pitts <bdp(at)uga(dot)edu>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-09 21:38:12
Message-ID: 13340.1312925892@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Excerpts from Jeff Davis's message of Tue Aug 09 16:03:26 -0400 2011:
>> I think I agree with Peter here that it's not a very good idea, and I
>> don't see a big upside. With tablespaces it seems to make a little bit
>> more sense, but I'd still lean away from that idea.

> What if the init script tries to start postmaster before the filesystems
> are mounted? ISTM requiring a subdir is a good sanity check that the
> system is ready to run. Not creating stuff directly on the mountpoint
> ensures consistency.

I went looking in the archives for previous discussions of this idea.
Most of them seem to focus on tablespaces rather than the primary data
directory, but the objections to doing it are pretty much the same
either way. The security concerns I mentioned seem to boil down to this
(from <25791(dot)1132238048(at)sss(dot)pgh(dot)pa(dot)us>):

Yeah, you *can* make it not-root-owned on most Unixen. That doesn't
mean it's a good idea to do so. For instance, if the root directory
is owned by Joe Luser, what's to stop him from blowing away lost+found
and thereby screwing up future fscks? You should basically never have
more-privileged objects (such as lost+found) inside directories owned by
less-privileged users --- it's just asking for trouble.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Brian Pitts <bdp(at)uga(dot)edu>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ignore lost+found when checking if a directory is empty
Date: 2011-08-14 02:23:36
Message-ID: 201108140223.p7E2NaJ24955@momjian.us
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > Excerpts from Jeff Davis's message of Tue Aug 09 16:03:26 -0400 2011:
> >> I think I agree with Peter here that it's not a very good idea, and I
> >> don't see a big upside. With tablespaces it seems to make a little bit
> >> more sense, but I'd still lean away from that idea.
>
> > What if the init script tries to start postmaster before the filesystems
> > are mounted? ISTM requiring a subdir is a good sanity check that the
> > system is ready to run. Not creating stuff directly on the mountpoint
> > ensures consistency.
>
> I went looking in the archives for previous discussions of this idea.
> Most of them seem to focus on tablespaces rather than the primary data
> directory, but the objections to doing it are pretty much the same

FYI, the 9.0+ code will create a subdirectory under the tablespace
directory named after the catversion number, and it doesn't check that
the directory is empty, particularly so pg_upgrade can do its magic.
So, I believe lost+found would work in such a case, but again, the
security issues are real.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +