Re: On file locking

Lists: pgsql-hackers
From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: On file locking
Date: 2003-01-31 03:23:54
Message-ID: 20030131032354.GL12957@filer

I've been looking at the PID file creation mechanism we currently use.
It goes through a loop in an attempt to create the PID file, and if
one is there it attempts to remove it if the PID it contains no longer
exists (there are checks for shared memory usage as well).
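
The existing scheme described above can be sketched roughly as follows. This is not the PostgreSQL source: it is a minimal illustration in Python (whose os module maps onto the same open(2) and kill(2) calls), the names `pid_is_alive` and `create_pid_file` are hypothetical, and the shared memory checks are left out.

```python
import os


def pid_is_alive(pid):
    """Probe for a process with signal 0, which only checks existence."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False          # no such process
    except PermissionError:
        return True           # process exists but belongs to another user
    return True


def create_pid_file(path, max_tries=5):
    """Create `path` exclusively, clearing out a stale file if necessary."""
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            try:
                with open(path) as f:
                    old_pid = int(f.read().split()[0])
            except FileNotFoundError:
                continue      # the file vanished under us; try again
            except (ValueError, IndexError):
                raise RuntimeError(f"{path} exists but holds no usable PID")
            if pid_is_alive(old_pid):
                raise RuntimeError(f"{path} is owned by live PID {old_pid}")
            try:
                os.unlink(path)   # stale: remove it and loop to recreate
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return
    raise RuntimeError(f"could not create {path} after {max_tries} tries")
```

Note the window between the liveness check and the unlink: two processes starting at once can both decide the file is stale, which is the kind of race an OS-level lock would avoid.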

This could be cleaned up rather dramatically if we were to use one of
the file locking primitives supplied by the OS to grab an exclusive
lock on the file, and the upside is that, when the locking code is
used, the postmaster would *know* whether or not there's another
postmaster running, but the price for that is that we'd have to eat a
file descriptor (closing the file means losing the lock), and we'd
still have to retain the old code anyway in the event that there is no
suitable file locking mechanism to use on the platform in question.
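
A minimal sketch of the lock-based alternative, assuming flock(2) is the primitive chosen (fcntl(2) locking would serve as well). The function name `acquire_pid_lock` is hypothetical. The returned descriptor must stay open for the life of the process, since closing it drops the lock.

```python
import fcntl
import os


def acquire_pid_lock(path):
    """Open the PID file and try to take an exclusive advisory lock on it.

    Returns the open descriptor on success, or None if some other
    process already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail at once instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    # We own the lock, so it is now safe to overwrite the contents.
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd
```

Because the kernel releases the lock when its holder exits, a leftover file from a crashed process never needs a staleness check: the lock attempt simply succeeds.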

The first question for the group is: is it worth doing that?

The second question for the group is: if we do indeed decide to do
file locking in that manner, what *other* applications of the OS-level
file locking mechanism will we have? Some of them allow you to lock
sections of a file, for instance, while others apply a lock on the
entire file. It's not clear to me that the former will be available
on all the platforms we're interested in, so locking the entire file
is probably the only thing we can really count on (and keep in mind
that even if an API to lock sections of a file is available, it may
well be that it's implemented by locking the entire file anyway).

What I had in mind was implementation of a file locking function that
would take a file descriptor and a file range. If the underlying OS
mechanism supported it, it would lock that range. The interesting
case is when the underlying OS mechanism did *not* support it. Would
it be more useful in that case to return an error indication? Would
it be more useful to simply lock the entire file? If no underlying
file locking mechanism is available, it seems obvious to me that the
function would have to always return an error.
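
One possible shape for such a function, as a sketch only. The name `lock_file_range`, its keyword arguments, and the `HAVE_RANGE_LOCKS` / `HAVE_FILE_LOCKS` flags are hypothetical stand-ins for whatever a configure-time check would provide. Here the caller chooses between the two behaviours asked about above (error out, or fall back to a whole-file lock), and the return value reports which granularity was actually obtained.

```python
import errno
import os

try:
    import fcntl
except ImportError:           # platform with no fcntl module at all
    fcntl = None

# Hypothetical capability flags; a real build would set these at configure time.
HAVE_RANGE_LOCKS = fcntl is not None and hasattr(fcntl, "lockf")
HAVE_FILE_LOCKS = fcntl is not None and hasattr(fcntl, "flock")


def lock_file_range(fd, start, length, *, fallback_to_whole_file=False,
                    have_range_locks=HAVE_RANGE_LOCKS,
                    have_file_locks=HAVE_FILE_LOCKS):
    """Take a non-blocking exclusive lock on bytes [start, start + length).

    Returns "range" if a byte-range lock was taken, "file" if the whole
    file was locked instead. Raises OSError if the lock is contended or
    if no suitable mechanism exists.
    """
    if have_range_locks:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                    length, start, os.SEEK_SET)
        return "range"
    if have_file_locks and fallback_to_whole_file:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return "file"
    raise OSError(errno.ENOLCK, "no suitable file locking mechanism")
```

Returning the granularity lets a caller that really depends on partial locks detect the fallback rather than silently serializing on the whole file.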

Thoughts?

--
Kevin Brown kevin(at)sysexperts(dot)com


From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Kevin Brown" <kevin(at)sysexperts(dot)com>, "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 03:54:57
Message-ID: GNELIHDDFBOCMGBFGEFOKEFACFAA.chriskl@familyhealth.com.au

My problem is FreeBSD getting totally loaded, at which point it sends kills
to various processes. This sometimes seems to end up with several actual
postmasters running, and none of them working.

Better detection of existing processes would help with that greatly, I'm sure.

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Brown <kevin(at)sysexperts(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 04:26:12
Message-ID: 5842.1043987172@sss.pgh.pa.us

Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> This could be cleaned up rather dramatically if we were to use one of
> the file locking primitives supplied by the OS to grab an exclusive
> lock on the file, and the upside is that, when the locking code is
> used, the postmaster would *know* whether or not there's another
> postmaster running, but the price for that is that we'd have to eat a
> file descriptor (closing the file means losing the lock),

Yeah, I was just thinking about that this morning. Eating one file
descriptor in the postmaster is absolutely no problem --- the postmaster
doesn't have all that many files open anyhow. What I was wondering was
whether it was worth eating an FD for every backend process, by holding
open the file inherited from the postmaster. If we did that, we would
have a reliable way of detecting that the old postmaster died but left
surviving child backends. (As I mentioned in a nearby flamefest, the
existing interlock for this situation strikes me as mighty fragile.)

But this only wins if a child process inheriting an open file also
inherits copies of any locks held by the parent. If not, then the
issue is moot. Anybody have any idea if file locks work that way?
Is it portable??

> The second question for the group is: if we do indeed decide to do
> file locking in that manner, what *other* applications of the OS-level
> file locking mechanism will we have?

I can't see any use in partial-file locks for us, and would not want
to design an internal API that expects them to work.

regards, tom lane


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Kevin Brown <kevin(at)sysexperts(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 04:42:03
Message-ID: 1043988122.62258.52.camel@jester

> file descriptor (closing the file means losing the lock), and we'd
> still have to retain the old code anyway in the event that there is no
> suitable file locking mechanism to use on the platform in question.

What is the gain given the above statement? If what we currently do can
cause issues (fail), then beefing it up where available may be useful --
but otherwise it's just additional code.
--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc


From: "Shridhar Daithankar<shridhar_daithankar(at)persistent(dot)co(dot)in>" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 06:50:41
Message-ID: 200301311220.41676.shridhar_daithankar@persistent.co.in

On Friday 31 Jan 2003 9:56 am, Tom Lane wrote:
> But this only wins if a child process inheriting an open file also
> inherits copies of any locks held by the parent. If not, then the
> issue is moot. Anybody have any idea if file locks work that way?
> Is it portable??

In my experience with HP-UX and Linux, they do differ. How much, I don't
remember.

I have a stupid proposal: set file locking aside. I think a shared memory
segment can be kept alive even after the process dies. Why not write the
shared memory segment ID to a file and let the postmaster check that segment?
That would be much easier.

Besides file locking is implemented using setgid bit on most unices. And
everybody is free to do what he/she thinks right with it.

Maybe stupid, but just a thought...

Shridhar


From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 08:46:39
Message-ID: 20030131084639.GM12957@filer

Tom Lane wrote:
> But this only wins if a child process inheriting an open file also
> inherits copies of any locks held by the parent. If not, then the
> issue is moot. Anybody have any idea if file locks work that way?
> Is it portable??

An alternate way might be to use semaphores, but I can't see how to do
that using the standard PGSemaphores implementation: it appears to
depend on cooperating processes inheriting a copy of the postmaster's
heap.

And since the POSIX semaphores default to unnamed ones, it appears
this idea is also a dead end unless my impressions are dead wrong...

--
Kevin Brown kevin(at)sysexperts(dot)com


From: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Brown <kevin(at)sysexperts(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 12:33:39
Message-ID: Pine.GSO.4.44.0301311425400.26712-100000@paju.oulu.fi

> But this only wins if a child process inheriting an open file also
> inherits copies of any locks held by the parent. If not, then the
> issue is moot. Anybody have any idea if file locks work that way?
> Is it portable??

From the Red Hat 8.0 manpage for fork(2):

SYNOPSIS
       #include <sys/types.h>
       #include <unistd.h>

       pid_t fork(void);

DESCRIPTION
       fork creates a child process that differs from the parent process only
       in its PID and PPID, and in the fact that resource utilizations are set
       to 0. File locks and pending signals are not inherited.
             ^^^^^^^^^^                     ^^^^^^^^^^^^^^^^^^

And from SunOS 5.8 flock
Locks are on files, not file descriptors. That is, file
descriptors duplicated through dup(2) or fork(2) do not
result in multiple instances of a lock, but rather multiple
references to a single lock. If a process holding a lock on
a file forks and the child explicitly unlocks the file, the
parent will lose its lock. Locks are not inherited by a
child process.

If I understand correctly, it says that if the parent dies, the file is
unlocked regardless of whether there are children still running?

--
Antti Haapala


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
Cc: Kevin Brown <kevin(at)sysexperts(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-01-31 15:34:52
Message-ID: 8834.1044027292@sss.pgh.pa.us

Antti Haapala <antti(dot)haapala(at)iki(dot)fi> writes:
> And from SunOS 5.8 flock
> Locks are on files, not file descriptors. That is, file
> descriptors duplicated through dup(2) or fork(2) do not
> result in multiple instances of a lock, but rather multiple
> references to a single lock. If a process holding a lock on
> a file forks and the child explicitly unlocks the file, the
> parent will lose its lock. Locks are not inherited by a
> child process.

That seems self-contradictory. If the fork results in multiple
references to the open file, then I should think that if the parent
dies but the child still holds the file open, then the lock still
exists. Seems that some experimentation is called for ...

regards, tom lane


From: Curt Sampson <cjs(at)cynic(dot)net>
To: "Shridhar Daithankar<shridhar_daithankar(at)persistent(dot)co(dot)in>" <shridhar_daithankar(at)persistent(dot)co(dot)in>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-02-01 00:26:37
Message-ID: Pine.NEB.4.51.0302010924120.517@angelic.cynic.net

On Fri, 31 Jan 2003, Shridhar Daithankar<shridhar_daithankar(at)persistent(dot)co(dot)in> wrote:

> Besides file locking is implemented using setgid bit on most unices. And
> everybody is free to do what he/she thinks right with it.

I don't believe it's implemented with the setgid bit on most Unices. As
I recall, it's certainly not on Xenix, SCO Unix, any of the BSDs, Linux,
SunOS, Solaris, and Tru64 Unix.

(I'm talking about the flock system call, here.)

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, Kevin Brown <kevin(at)sysexperts(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-02-01 06:11:28
Message-ID: Pine.NEB.4.51.0302011509320.610@angelic.cynic.net

On Fri, 31 Jan 2003, Tom Lane wrote:

> Antti Haapala <antti(dot)haapala(at)iki(dot)fi> writes:
> > And from SunOS 5.8 flock
> > Locks are on files, not file descriptors. That is, file
> > descriptors duplicated through dup(2) or fork(2) do not
> > result in multiple instances of a lock, but rather multiple
> > references to a single lock. If a process holding a lock on
> > a file forks and the child explicitly unlocks the file, the
> > parent will lose its lock. Locks are not inherited by a
> > child process.
>
> That seems self-contradictory.

Yes. I note that in NetBSD, that paragraph of the manual page is
identical except that the last sentence has been removed.

At any rate, since the child has the *same* descriptor as the parent had,
it seems to me highly unlikely that the lock would disappear.

The other option would be that the lock belongs to the process, in which
case one would think that a child doing an unlock should not affect the
parent, because it's a different process....
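
The two ownership models can be told apart with a small experiment. The sketch below is an illustration in Python (its fcntl module wraps flock(2) and fcntl(2) locking); the helper names are hypothetical. The parent locks a file, a forked child unlocks it through the inherited descriptor, and a third process probes whether the file is still locked.

```python
import fcntl
import os
import tempfile


def _run_in_child(fn):
    """Fork, run fn() in the child, return 0 if it returned True, else 1 or 2."""
    pid = os.fork()
    if pid == 0:
        status = 2                       # reported if fn() raises
        try:
            status = 0 if fn() else 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


def child_unlock_drops_parent_lock(lock_call):
    """lock_call is fcntl.flock or fcntl.lockf; returns True if the
    child's unlock released the lock taken by the parent."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = tmp.fileno()
        lock_call(fd, fcntl.LOCK_EX)              # parent takes the lock

        def unlock_in_child():
            lock_call(fd, fcntl.LOCK_UN)          # via the inherited descriptor
            return True

        def probe():
            # A separate process with a fresh open() of the same file.
            probe_fd = os.open(tmp.name, os.O_RDWR)
            try:
                lock_call(probe_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                return False                      # parent's lock is still there
            return True                           # the file was unlocked

        if _run_in_child(unlock_in_child) != 0:
            raise RuntimeError("child failed to run the unlock call")
        return _run_in_child(probe) == 0
```

On Linux, `child_unlock_drops_parent_lock(fcntl.flock)` returns True (the lock belongs to the shared open file) and `child_unlock_drops_parent_lock(fcntl.lockf)` returns False (the lock belongs to the process). Other platforms may differ, which is the portability question at hand.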

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-02-01 16:31:04
Message-ID: 20030201163103.GO12957@filer

Curt Sampson wrote:
> On Fri, 31 Jan 2003, Shridhar Daithankar<shridhar_daithankar(at)persistent(dot)co(dot)in> wrote:
>
> > Besides file locking is implemented using setgid bit on most unices. And
> > everybody is free to do what he/she thinks right with it.
>
> I don't believe it's implemented with the setgid bit on most Unices. As
> I recall, it's certainly not on Xenix, SCO Unix, any of the BSDs, Linux,
> SunOS, Solaris, and Tru64 Unix.
>
> (I'm talking about the flock system call, here.)

Linux, at least, supports mandatory file locks. The Linux kernel
documentation mentions that you're supposed to use fcntl() or lockf()
(the latter being a library wrapper around the former) to actually
lock the file but, when those operations are applied to a file that
has the setgid bit set but without the group execute bit set, the
kernel enforces it as a mandatory lock. That means that operations
like open(), read(), and write() initiated by other processes on the
same file will block (or return EAGAIN, if O_NONBLOCK was used to open
it) if that's what the lock on the file calls for.

That same documentation mentions that locks acquired using flock()
will *not* invoke the mandatory lock semantics even if on a file
marked for it, so I guess flock() isn't implemented on top of fcntl()
in Linux.

So if we wanted to make use of mandatory locks, we'd have to refrain
from using flock().

--
Kevin Brown kevin(at)sysexperts(dot)com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Brown <kevin(at)sysexperts(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-02-01 16:52:28
Message-ID: 7594.1044118348@sss.pgh.pa.us

Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> So if we wanted to make use of mandatory locks, we'd have to refrain
> from using flock().

We have no need for mandatory locks; the advisory style will do fine.
This is true because we have no desire to interoperate with any
non-Postgres code ... everyone else is supposed to stay the heck out of
$PGDATA.

regards, tom lane


From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-02-01 17:11:45
Message-ID: 20030201171145.GQ12957@filer

Tom Lane wrote:
> Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> > So if we wanted to make use of mandatory locks, we'd have to refrain
> > from using flock().
>
> We have no need for mandatory locks; the advisory style will do fine.
> This is true because we have no desire to interoperate with any
> non-Postgres code ... everyone else is supposed to stay the heck out of
> $PGDATA.

True. But, of course, mandatory locks could be used to *make*
everyone else stay out of $PGDATA. :-)

--
Kevin Brown kevin(at)sysexperts(dot)com


From: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
To: Kevin Brown <kevin(at)sysexperts(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On file locking
Date: 2003-02-03 10:29:47
Message-ID: Pine.GSO.4.44.0302031225580.8837-100000@paju.oulu.fi

> That same documentation mentions that locks acquired using flock()
> will *not* invoke the mandatory lock semantics even if on a file
> marked for it, so I guess flock() isn't implemented on top of fcntl()
> in Linux.

They're not. And there's another difference between fcntl and flock in
Linux: although fork(2) states that file locks are not inherited, locks
made by flock are inherited by children, which keep the lock even when
the parent process is killed with SIGKILL. I tested this.

Just see man syscall; both exist as separate system calls:
flock(2)
and
fcntl(2)
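
One way such a test might be written, as a sketch rather than the test actually run here. It is in Python, whose fcntl module wraps both calls, and the function name `lock_outlives_parent` is hypothetical. A stand-in "postmaster" opens and locks a file, forks a "backend" that keeps the descriptor open, and is then killed with SIGKILL; the caller then probes the lock from outside.

```python
import fcntl
import os
import signal
import tempfile
import time


def lock_outlives_parent(lock_call):
    """lock_call is fcntl.flock or fcntl.lockf. Returns True if the file
    is still locked after the locking process has been SIGKILLed while
    its forked child keeps the descriptor open."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    r, w = os.pipe()
    parent = os.fork()
    if parent == 0:                        # stand-in for the postmaster
        try:
            os.close(r)
            lock_fd = os.open(path, os.O_RDWR)
            lock_call(lock_fd, fcntl.LOCK_EX)
            backend = os.fork()
            if backend == 0:               # stand-in for a backend
                time.sleep(30)             # just hold the inherited descriptor
            else:
                os.write(w, f"{backend}\n".encode())
                time.sleep(30)             # wait around to be killed
        finally:
            os._exit(0)
    os.close(w)
    backend = int(os.read(r, 32).decode())  # arrives only after lock + fork
    os.close(r)
    os.kill(parent, signal.SIGKILL)
    os.waitpid(parent, 0)                  # the lock holder is now gone
    probe_fd = os.open(path, os.O_RDWR)
    try:
        lock_call(probe_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        still_locked = False
    except OSError:
        still_locked = True
    finally:
        os.close(probe_fd)
        os.kill(backend, signal.SIGKILL)   # clean up the orphaned child
        os.unlink(path)
    return still_locked
```

On Linux, `lock_outlives_parent(fcntl.flock)` returns True and `lock_outlives_parent(fcntl.lockf)` returns False, consistent with the behaviour reported above. Results on other platforms would need to be checked individually.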

--
Antti Haapala
+358 50 369 3535
ICQ: #177673735