
semaphore usage "port based"?

Lists:	pgsql-hackers

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	semaphore usage "port based"?
Date:	2006-04-02 19:52:17
Message-ID:	20060402163504.T947@ganymede.hub.org

I've got an odd issue that I'm not sure how to fix ... or, if fixing is
even possible ...

I just put into place a FreeBSD 6.x server ... it has 2 jails running on
it, and inside of each, I'm trying to run a PostgreSQL 7.4.12 server
(OpenACS requirement, no choice there) ...

Now, on my older FreeBSD 4.x servers, I have about 17 PostgreSQL servers
(some 7.2, some 7.4, some 8.x) ... and they all run fine, and they all run
on port 5432 ...

Now, something in FreeBSD has changed since 4.x that, if you start up a
second PostgreSQL server on port 5432, the first one starts to generate
"semctl: Invalid argument" errors ...

If I move one to port 5433, both run great ...

Now, since this *did* work fine with 4.x, the FreeBSD developers have
obviously changed something that is causing it not to work ... but, since
'changing port' appears to fix it, I'm wondering if there is something in
our Semaphore creation code that can be tweaked so that the semaphore side
of things *thinks* it's running on a different port, but it still responds
to port 5432?

Or, more simply, I think ... is there somewhere in the Semaphore code that
is using the port # as a 'seed'?

I'm trying to attack things from the FreeBSD side too, to find out what
has changed, and how to fix it, but figured I might be able to come up
with a quicker fix from this group ...

Thx ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-02 20:58:17
Message-ID:	20060402175303.C947@ganymede.hub.org

'k, an excerpt from a thread on the freebsd lists ... I'm not sure how to
answer:

----
On Sun, Apr 02, 2006 at 05:24:10PM -0300, Marc G. Fournier wrote:
> On Sun, 2 Apr 2006, Kris Kennaway wrote:
>
> >>Right, but why are they doing it *consistently* in FreeBSD 6.x, when they
> >>never did it in FreeBSD 4.x? I have postmaster processes running on the
> >>FreeBSD box as far back as November 27th, 2005 ... and have *never*
> >>experienced this problem ... so it isn't PostgreSQL that has changed,
> >>something in FreeBSD has changed :(
> >
> >You'll need to do some debugging to find out which of the two causes
> >of EINVAL are true here (or some undocumented cause).
>
> 'k, right now, the checks in PostgreSQL are just seeing if the result of
> semctl < 0 ... i see from the man page what 'two values' of EINVAL you are
> referring to ... but, if they both return the same ERRNO, how do I
> determine which of the two is the cause of the problem? :(

Evaluate context: what other semaphore operations have been performed
previously?

Kris
------

is there any easy way to answer this? I'm getting the Invalid Argument
error for SETVAL and IPC_RMID ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-02 22:23:24
Message-ID:	25422.1144016604@sss.pgh.pa.us

"Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> Or, more simply, I think ... is there somewhere in the Semaphore code that
> is using the port # as a 'seed'?

We use the port number as a basis for selecting the semaphore key (see
semget(2)). There is code in there to pick a different key value if the
one we first selected appears to be in use; that has to work correctly
if you're going to run multi postmasters on the same port number. It
sounds like FBSD 6 has done something that broke the key-in-use check.

Look at IpcSemaphoreCreate and InternalIpcSemaphoreCreate in
src/backend/port/sysv_sema.c. It may be worth stepping through them
with gdb to see what the semget calls are returning.
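
A minimal sketch of the key-selection idea described above, not the actual sysv_sema.c code. It assumes candidate keys are formed as port * 1000 + n, which is only inferred from the 5432001..5432007 keys in the ipcs output later in this thread; the function names are hypothetical. The real code also checks whether an existing set belongs to a dead postmaster before skipping it, which this sketch leaves out.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose SysV IPC declarations under strict -std modes */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumed scheme: key = port * 1000 + attempt (hypothetical helper). */
key_t candidate_sema_key(int port, int attempt)
{
    return (key_t) port * 1000 + attempt;
}

/*
 * Walk candidate keys until semget() creates a brand-new set.
 * Returns the semaphore set id and stores the key in *key_out,
 * or returns -1 on a non-collision error or when attempts run out.
 */
int create_sema_set(int port, int num_sems, int max_attempts, key_t *key_out)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        key_t key = candidate_sema_key(port, attempt);
        int   id = semget(key, num_sems, IPC_CREAT | IPC_EXCL | 0600);

        if (id >= 0)
        {
            *key_out = key;
            return id;
        }

        /* Same "collision" errnos as InternalIpcSemaphoreCreate: try the next key. */
        if (errno != EEXIST && errno != EACCES
#ifdef EIDRM
            && errno != EIDRM
#endif
            )
            return -1;
    }
    return -1;
}
```

With this scheme, two postmasters on port 5432 both start probing at key 5432001, so everything depends on the second one correctly detecting that the first one's sets are still in use.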

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-02 22:36:28
Message-ID:	25526.1144017388@sss.pgh.pa.us

I wrote:
> Look at IpcSemaphoreCreate and InternalIpcSemaphoreCreate in
> src/backend/port/sysv_sema.c. It may be worth stepping through them
> with gdb to see what the semget calls are returning.

BTW, even before doing that, you should look at "ipcs -s" output to try
to get a clue what's going on. The EINVAL failures may be because the
second postmaster to start deletes the semaphores created by the first
one. You could easily see this happening in before-and-after ipcs data
if so.
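
To illustrate why a deleted set would show up as a semctl failure in the surviving postmaster, here is a small sketch that creates a private semaphore set, removes it, and then queries it through the stale id. The function name is hypothetical, and the exact errno is platform dependent; EINVAL is what the thread reports, and some systems may report EIDRM instead.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose SysV IPC declarations under strict -std modes */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Returns the errno seen when using a semaphore set id after the set
 * has been removed. Returns -1 if SysV semaphores are not usable here,
 * and -2 if the stale id unexpectedly still works.
 */
int errno_after_removal(void)
{
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (id < 0)
        return -1;                      /* no SysV IPC available */

    if (semctl(id, 0, IPC_RMID) < 0)
        return -1;                      /* could not remove the set */

    errno = 0;
    if (semctl(id, 0, GETVAL) >= 0)
        return -2;                      /* stale id still valid: unexpected */

    return errno;                       /* error reported for the stale id */
}
```

The first postmaster is in the same position as the final semctl call here: it still holds ids for sets that another process has since removed.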

strace'ing startup of the second postmaster is another approach that
might be easier than gdb'ing.

regards, tom lane

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	freebsd-stable(at)freebsd(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 00:56:32
Message-ID:	20060402213921.V947@ganymede.hub.org

On Sun, 2 Apr 2006, Tom Lane wrote:

> I wrote:
>> Look at IpcSemaphoreCreate and InternalIpcSemaphoreCreate in
>> src/backend/port/sysv_sema.c. It may be worth stepping through them
>> with gdb to see what the semget calls are returning.
>
> BTW, even before doing that, you should look at "ipcs -s" output to try
> to get a clue what's going on. The EINVAL failures may be because the
> second postmaster to start deletes the semaphores created by the first
> one. You could easily see this happening in before-and-after ipcs data
> if so.

You are right ...

Before:

Semaphores:
T ID      KEY      MODE         OWNER  GROUP  CREATOR  CGROUP  NSEMS  OTIME     CTIME
s 524288  5432001  --rw-------  70     70     70       70      17     14:44:19  14:44:19
s 524289  5432002  --rw-------  70     70     70       70      17     14:44:19  14:44:19
s 524290  5432003  --rw-------  70     70     70       70      17     14:44:19  14:44:19
s 524291  5432004  --rw-------  70     70     70       70      17     14:44:19  14:44:19
s 524292  5432005  --rw-------  70     70     70       70      17     14:44:19  14:44:19
s 524293  5432006  --rw-------  70     70     70       70      17     20:23:56  14:44:19
s 524294  5432007  --rw-------  70     70     70       70      17     20:23:58  14:44:19

After:

Semaphores:
T ID      KEY      MODE         OWNER  GROUP  CREATOR  CGROUP  NSEMS  OTIME     CTIME
s 589824  5432001  --rw-------  70     70     70       70      17     21:38:03  21:38:03
s 589825  5432002  --rw-------  70     70     70       70      17     21:38:03  21:38:03
s 589826  5432003  --rw-------  70     70     70       70      17     21:38:03  21:38:03
s 589827  5432004  --rw-------  70     70     70       70      17     21:38:03  21:38:03
s 589828  5432005  --rw-------  70     70     70       70      17     21:38:03  21:38:03
s 589829  5432006  --rw-------  70     70     70       70      17     21:38:03  21:38:03
s 589830  5432007  --rw-------  70     70     70       70      17     21:38:03  21:38:03

So, our semget() is trying to acquire 5432001, FreeBSD's semget is
reporting back that it's not in use, so the second instance is basically
'punting' the original one off of it ...

Kris, from the PostgreSQL sources, here is where we try and set the semId
to use ... is there something we are doing wrong with our code as far as
FreeBSD 6.x is concerned, such that semget is not returning a negative
value when the key is already in use? Or is there a problem with semget()
in a jail such that it is allowing for the KEY to be reused, instead of
returning a negative value?

========
static IpcSemaphoreId
InternalIpcSemaphoreCreate(IpcSemaphoreKey semKey, int numSems)
{
    int         semId;

    semId = semget(semKey, numSems, IPC_CREAT | IPC_EXCL | IPCProtection);

    if (semId < 0)
    {
        /*
         * Fail quietly if error indicates a collision with existing set.
         * One would expect EEXIST, given that we said IPC_EXCL, but
         * perhaps we could get a permission violation instead? Also,
         * EIDRM might occur if an old set is slated for destruction but
         * not gone yet.
         */
        if (errno == EEXIST || errno == EACCES
#ifdef EIDRM
            || errno == EIDRM
#endif
            )
            return -1;

        /*
         * Else complain and abort
         */
        ereport(FATAL,
                (errmsg("could not create semaphores: %m"),
                 errdetail("Failed system call was semget(%d, %d, 0%o).",
                           (int) semKey, numSems,
                           IPC_CREAT | IPC_EXCL | IPCProtection),
                 (errno == ENOSPC) ?
                 errhint("This error does *not* mean that you have run out of disk space.\n"
                         "It occurs when either the system limit for the maximum number of "
                         "semaphore sets (SEMMNI), or the system wide maximum number of "
                         "semaphores (SEMMNS), would be exceeded. You need to raise the "
                         "respective kernel parameter. Alternatively, reduce PostgreSQL's "
                         "consumption of semaphores by reducing its max_connections parameter "
                         "(currently %d).\n"
                         "The PostgreSQL documentation contains more information about "
                         "configuring your system for PostgreSQL.",
                         MaxBackends) : 0));
    }

    return semId;
}
========

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	freebsd-stable(at)freebsd(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 01:06:25
Message-ID:	26524.1144026385@sss.pgh.pa.us

"Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> On Sun, 2 Apr 2006, Tom Lane wrote:
>> BTW, even before doing that, you should look at "ipcs -s" output to try
>> to get a clue what's going on. The EINVAL failures may be because the
>> second postmaster to start deletes the semaphores created by the first
>> one. You could easily see this happening in before-and-after ipcs data
>> if so.

> You are right ...

OK, could we see strace (or whatever BSD calls it) output for the second
postmaster? I'd like to see exactly what results it's getting for the
kernel calls it makes during IpcSemaphoreCreate.

regards, tom lane

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 01:31:39
Message-ID:	20060402222843.X947@ganymede.hub.org

On Sun, 2 Apr 2006, Tom Lane wrote:

> "Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
>> On Sun, 2 Apr 2006, Tom Lane wrote:
>>> BTW, even before doing that, you should look at "ipcs -s" output to try
>>> to get a clue what's going on. The EINVAL failures may be because the
>>> second postmaster to start deletes the semaphores created by the first
>>> one. You could easily see this happening in before-and-after ipcs data
>>> if so.
>
>> You are right ...
>
> OK, could we see strace (or whatever BSD calls it) output for the second
> postmaster? I'd like to see exactly what results it's getting for the
> kernel calls it makes during IpcSemaphoreCreate.

'k, don't know what strace is ... we have ktrace and truss ... truss is
what I usually use, and is:

DESCRIPTION
The truss utility traces the system calls called by the specified process
or program. Output is to the specified output file, or standard error by
default. It does this by stopping and restarting the process being
monitored via procfs(5).

And shows output like:

# truss ls
ioctl(1,TIOCGETA,0x7fbff514) = 0 (0x0)
ioctl(1,TIOCGWINSZ,0x7fbff588) = 0 (0x0)
getuid() = 0 (0x0)
readlink("/etc/malloc.conf",0x7fbff470,63) ERR#2 'No such file or directory'
mmap(0x0,4096,0x3,0x1002,-1,0x0) = 671666176 (0x2808d000)
break(0x809b000) = 0 (0x0)
break(0x809c000) = 0 (0x0)
break(0x809d000) = 0 (0x0)
break(0x809e000) = 0 (0x0)
stat(".",0x7fbff470) = 0 (0x0)
open(".",0x0,00) = 3 (0x3)
fchdir(0x3) = 0 (0x0)
open(".",0x0,00) = 4 (0x4)
stat(".",0x7fbff430) = 0 (0x0)
open(".",0x4,00) = 5 (0x5)
fstat(5,0x7fbff430) = 0 (0x0)
fcntl(0x5,0x2,0x1) = 0 (0x0)
__sysctl(0x7fbff2e8,0x2,0x8098760,0x7fbff2e4,0x0,0x0) = 0 (0x0)
fstatfs(0x5,0x7fbff330) = 0 (0x0)
break(0x809f000) = 0 (0x0)
getdirentries(0x5,0x809e000,0x1000,0x809a0b4) = 512 (0x200)
getdirentries(0x5,0x809e000,0x1000,0x809a0b4) = 0 (0x0)
lseek(5,0x0,0) = 0 (0x0)
close(5) = 0 (0x0)
fchdir(0x4) = 0 (0x0)
close(4) = 0 (0x0)
fstat(1,0x7fbff270) = 0 (0x0)
break(0x80a0000) = 0 (0x0)
ioctl(1,TIOCGETA,0x7fbff2a4) = 0 (0x0)
.cshrc .cvspass .history .login .psql_history .ssh
write(1,0x809f000,53) = 53 (0x35)
.cshrc~ .emacs.d .klogin .profile .rnd ktrace.out
write(1,0x809f000,53) = 53 (0x35)
exit(0x0) process exit, rval = 0

ktrace is:

DESCRIPTION
The ktrace utility enables kernel trace logging for the specified
processes. Kernel trace data is logged to the file ktrace.out. The kernel
operations that are traced include system calls, namei translations,
signal processing, and I/O.

And shows output like:

86523 ls RET __sysctl 0
86523 ls CALL fstatfs(0x5,0x7fbff330)
86523 ls RET fstatfs 0
86523 ls CALL break(0x809f000)
86523 ls RET break 0
86523 ls CALL getdirentries(0x5,0x809e000,0x1000,0x809a0b4)
86523 ls RET getdirentries 512/0x200
86523 ls CALL getdirentries(0x5,0x809e000,0x1000,0x809a0b4)
86523 ls RET getdirentries 0
86523 ls CALL lseek(0x5,0,0,0,0)

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 01:34:54
Message-ID:	26796.1144028094@sss.pgh.pa.us

"Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> On Sun, 2 Apr 2006, Tom Lane wrote:
>> OK, could we see strace (or whatever BSD calls it) output for the second
>> postmaster? I'd like to see exactly what results it's getting for the
>> kernel calls it makes during IpcSemaphoreCreate.

> 'k, don't know what strace is ... we have ktrace and truss ... truss is
> what I usually use, and is:

truss seems to have an output format closer to what I'm used to, but
either will do.

regards, tom lane

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)freebsd(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 01:53:01
Message-ID:	20060402225215.I947@ganymede.hub.org

Sent offlist ...

On Sun, 2 Apr 2006, Tom Lane wrote:

> "Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
>> On Sun, 2 Apr 2006, Tom Lane wrote:
>>> OK, could we see strace (or whatever BSD calls it) output for the second
>>> postmaster? I'd like to see exactly what results it's getting for the
>>> kernel calls it makes during IpcSemaphoreCreate.
>
>> 'k, don't know what strace is ... we have ktrace and truss ... truss is
>> what I usually use, and is:
>
> truss seems to have an output format closer to what I'm used to, but
> either will do.
>
> regards, tom lane

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:08:11
Message-ID:	27417.1144033691@sss.pgh.pa.us

"Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:
> 'k, try this one ... looks better, actually has semget() calls in it :)

OK, here's our problem:

84250: semget(0x52e2c1,0x11,0x780) ERR#17 'File exists'

This is InternalIpcSemaphoreCreate failing because of key collision.
As it should.
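
For readers decoding the hex arguments in that truss line, the following sketch unpacks semget(0x52e2c1,0x11,0x780). The helper names are hypothetical, and the split of the key into "port * 1000 + n" is an assumption inferred from the ipcs output earlier in the thread rather than something stated in the trace itself.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose SysV IPC declarations under strict -std modes */
#endif
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Values copied from the truss line: semget(0x52e2c1,0x11,0x780) */
enum { TRACE_KEY = 0x52e2c1, TRACE_NSEMS = 0x11, TRACE_FLAGS = 0x780 };

/* Port part of the key, assuming key = port * 1000 + n. */
int trace_port(void)       { return TRACE_KEY / 1000; }

/* Sequence part of the key under the same assumption. */
int trace_key_suffix(void) { return TRACE_KEY % 1000; }

/* Number of semaphores requested in the set. */
int trace_nsems(void)      { return TRACE_NSEMS; }

/* 1 if the flags ask for "create, and fail if the key already exists". */
int trace_is_exclusive_create(void)
{
    return (TRACE_FLAGS & (IPC_CREAT | IPC_EXCL)) == (IPC_CREAT | IPC_EXCL);
}

/* Permission bits carried in the low nine bits of the flags. */
int trace_mode(void)       { return TRACE_FLAGS & 0777; }
```

Under those assumptions the call is an exclusive create of a 17-semaphore set with key 5432001 and mode 0600, which lines up with the KEY, NSEMS, and MODE columns in the ipcs listing.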

84250: semget(0x52e2c1,0x11,0x0) = 1114112 (0x110000)

This is IpcSemaphoreCreate trying to see what's up. OK.

84250: __semctl(0x110000,0x10,0x5,0x0) = 537 (0x219)

IpcSemaphoreGetValue indicates it has the right "magic number" to be
a Postgres semaphore set. Still expected.

84250: __semctl(0x110000,0x10,0x4,0x0) = 83699 (0x146f3)

IpcSemaphoreGetLastPID says the sema set is last touched by pid 83699.
Looks reasonable (but do you want to double check that that matched the
first postmaster's PID?)

84250: getpid() = 84250 (0x1491a)

our pid ... as expected ...

84250: kill(0x146f3,0x0) ERR#3 'No such process'

Oops. Here is the problem: kill() is lying by claiming there is no such
process as 83699. It looks to me like there in fact is such a process,
but it's in a different jail.

I venture that FBSD 6 has decided to return ESRCH (no such process)
where FBSD 4 returned some other error that acknowledged that the
process did exist (EPERM would be a reasonable guess).

If this is the story, then FBSD have broken their system and must revert
their change. They do not have kernel behavior that totally hides the
existence of the other process, and therefore having some calls that
pretend it's not there is simply inconsistent.
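
The liveness probe at the centre of this analysis can be sketched as follows. This is an illustration, not the sysv_sema.c code: the function names are hypothetical, and the rule "treat the owner as alive unless kill() fails with ESRCH" is an assumption drawn from the reasoning above about ESRCH versus EPERM.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose kill() under strict -std modes */
#endif
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Interpret the outcome of kill(pid, 0).
 * Success means the process exists. Any failure other than ESRCH
 * (for example EPERM) still implies that some process has that pid.
 */
int owner_considered_alive(int kill_result, int kill_errno)
{
    return kill_result == 0 || kill_errno != ESRCH;
}

/* Probe a pid with signal 0, which performs error checking but sends nothing. */
int pid_looks_alive(pid_t pid)
{
    int rc;

    errno = 0;
    rc = kill(pid, 0);
    return owner_considered_alive(rc, errno);
}
```

With a rule like this, an ESRCH answer for a process that is in fact running in another jail makes the semaphore set look orphaned, so the newcomer feels free to remove and recreate it.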

regards, tom lane

From:	Kris Kennaway <kris(at)obsecurity(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:11:57
Message-ID:	20060403031157.GA57914@xor.obsecurity.org

On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:

> I venture that FBSD 6 has decided to return ESRCH (no such process)
> where FBSD 4 returned some other error that acknowledged that the
> process did exist (EPERM would be a reasonable guess).
>
> If this is the story, then FBSD have broken their system and must revert
> their change. They do not have kernel behavior that totally hides the
> existence of the other process, and therefore having some calls that
> pretend it's not there is simply inconsistent.

I'm guessing it's a deliberate change to prevent the information
leakage between jails.

Kris

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kris Kennaway <kris(at)obsecurity(dot)org>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:17:49
Message-ID:	27515.1144034269@sss.pgh.pa.us

Kris Kennaway <kris(at)obsecurity(dot)org> writes:
> On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:
>> If this is the story, then FBSD have broken their system and must revert
>> their change. They do not have kernel behavior that totally hides the
>> existence of the other process, and therefore having some calls that
>> pretend it's not there is simply inconsistent.

> I'm guessing it's a deliberate change to prevent the information
> leakage between jails.

I have no objection to doing that, so long as you are actually doing it
correctly. This example shows that each jail must have its own SysV
semaphore key space, else information leaks anyway. The current
situation breaks Postgres, and therefore I suggest reverting the errno
change until you are prepared to fix the SysV IPC stuff to be per-jail.

regards, tom lane

From:	Kris Kennaway <kris(at)obsecurity(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kris Kennaway <kris(at)obsecurity(dot)org>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:21:30
Message-ID:	20060403032130.GA58053@xor.obsecurity.org

On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
> Kris Kennaway <kris(at)obsecurity(dot)org> writes:
> > On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:
> >> If this is the story, then FBSD have broken their system and must revert
> >> their change. They do not have kernel behavior that totally hides the
> >> existence of the other process, and therefore having some calls that
> >> pretend it's not there is simply inconsistent.
>
> > I'm guessing it's a deliberate change to prevent the information
> > leakage between jails.
>
> I have no objection to doing that, so long as you are actually doing it
> correctly. This example shows that each jail must have its own SysV
> semaphore key space, else information leaks anyway.

By default SysV shared memory is disallowed in jails.

Kris

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kris Kennaway <kris(at)obsecurity(dot)org>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:26:52
Message-ID:	27571.1144034812@sss.pgh.pa.us

Kris Kennaway <kris(at)obsecurity(dot)org> writes:
> On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
>> I have no objection to doing that, so long as you are actually doing it
>> correctly. This example shows that each jail must have its own SysV
>> semaphore key space, else information leaks anyway.

> By default SysV shared memory is disallowed in jails.

Hm, the present problem seems to be about semaphores not shared memory
... although I'd not be surprised to find that there's a similar issue
around shared memory. Anyway, if FBSD's position is that they are
uninterested in supporting SysV IPC in connection with jails, then I
think the Postgres project position has to be that we are uninterested
in supporting Postgres inside FBSD jails. Sorry Marc :-(

regards, tom lane

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	Kris Kennaway <kris(at)obsecurity(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:30:58
Message-ID:	20060403002830.W947@ganymede.hub.org

On Sun, 2 Apr 2006, Kris Kennaway wrote:

> On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
>> Kris Kennaway <kris(at)obsecurity(dot)org> writes:
>>> On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:
>>>> If this is the story, then FBSD have broken their system and must revert
>>>> their change. They do not have kernel behavior that totally hides the
>>>> existence of the other process, and therefore having some calls that
>>>> pretend it's not there is simply inconsistent.
>>
>>> I'm guessing it's a deliberate change to prevent the information
>>> leakage between jails.
>>
>> I have no objection to doing that, so long as you are actually doing it
>> correctly. This example shows that each jail must have its own SysV
>> semaphore key space, else information leaks anyway.
>
> By default SysV shared memory is disallowed in jails.

'k, but how do I fix kill so that it has the proper behaviour if SysV is
enabled? Maybe a mount option for procfs that allows for pre-5.x
behaviour? I'm not the first one to point out that this is a problem, just
the first to follow it through to the cause ;( And I believe there is
more than just PostgreSQL that is affected by shared memory (i.e. apache2
needs SysV IPC enabled, so anyone doing that in a jail has it enabled
also) ...

Basically, I don't care if 'default' is ultra-secure ... but some means to
bring it down a notch would be nice ... :(

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Kris Kennaway <kris(at)obsecurity(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kris Kennaway <kris(at)obsecurity(dot)org>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:31:47
Message-ID:	20060403033146.GA58254@xor.obsecurity.org

On Sun, Apr 02, 2006 at 11:26:52PM -0400, Tom Lane wrote:
> Kris Kennaway <kris(at)obsecurity(dot)org> writes:
> > On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
> >> I have no objection to doing that, so long as you are actually doing it
> >> correctly. This example shows that each jail must have its own SysV
> >> semaphore key space, else information leaks anyway.
>
> > By default SysV shared memory is disallowed in jails.
>
> Hm, the present problem seems to be about semaphores not shared memory

Sorry, I meant IPC.

> ... although I'd not be surprised to find that there's a similar issue
> around shared memory. Anyway, if FBSD's position is that they are
> uninterested in supporting SysV IPC in connection with jails, then I
> think the Postgres project position has to be that we are uninterested
> in supporting Postgres inside FBSD jails.

No-one is taking a position of being "uninterested", so please don't
be hasty to reciprocate.

Kris

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	Kris Kennaway <kris(at)obsecurity(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:38:23
Message-ID:	20060403003619.L947@ganymede.hub.org

On Sun, 2 Apr 2006, Kris Kennaway wrote:

> No-one is taking a position of being "uninterested", so please don't
> be hasty to reciprocate.

I just posted it off the -hackers list, but there is an ancient patch in
the FreeBSD queue for implementing Private IPCs for Jails ... not sure why
it was never committed, or what is involved in bringing it up to speed with
the current 6.x and / or -current kernels though ... but, as I mentioned
in another thread, I know that *at least* Apache2 makes use of shared
memory for some of its stuff ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Kris Kennaway <kris(at)obsecurity(dot)org>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	Kris Kennaway <kris(at)obsecurity(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:41:01
Message-ID:	20060403034101.GA58429@xor.obsecurity.org

On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote:
> On Sun, 2 Apr 2006, Kris Kennaway wrote:
>
> >On Sun, Apr 02, 2006 at 11:17:49PM -0400, Tom Lane wrote:
> >>Kris Kennaway <kris(at)obsecurity(dot)org> writes:
> >>>On Sun, Apr 02, 2006 at 11:08:11PM -0400, Tom Lane wrote:
> >>>>If this is the story, then FBSD have broken their system and must revert
> >>>>their change. They do not have kernel behavior that totally hides the
> >>>>existence of the other process, and therefore having some calls that
> >>>>pretend it's not there is simply inconsistent.
> >>
> >>>I'm guessing it's a deliberate change to prevent the information
> >>>leakage between jails.
> >>
> >>I have no objection to doing that, so long as you are actually doing it
> >>correctly. This example shows that each jail must have its own SysV
> >>semaphore key space, else information leaks anyway.
> >
> >By default SysV shared memory is disallowed in jails.
>
> 'k, but how do I fix kill so that it has the proper behaviour if SysV is
> enabled?

Check the source, perhaps there's already a way. If not, talk to
whoever made the change.

> Maybe a mount option for procfs that allows for pre-5.x
> behaviour?

procfs has nothing to do with this though.

> I'm not the first one to point out that this is a problem, just
> the first to follow it through to the cause ;( And I believe there is
> more than just PostgreSQL that is affected by shared memory (i.e. apache2
> needs SysV IPC enabled, so anyone doing that in a jail has it enabled
> also) ...

Also note that SysV IPC is not the problem here, it's the change in
the behaviour of kill() that is causing postgresql to become confused.
That's what you should investigate.

Kris

From:	Andrew Thompson <thompsa(at)freebsd(dot)org>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	Kris Kennaway <kris(at)obsecurity(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)freebsd(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 03:59:11
Message-ID:	20060403035911.GA76193@heff.fud.org.nz
Lists:	pgsql-hackers

On Sun, Apr 02, 2006 at 11:41:01PM -0400, Kris Kennaway wrote:
> On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote:
> > 'k, but how do I fix kill so that it has the proper behaviour if SysV is
> > enabled?
>
> Check the source, perhaps there's already a way. If not, talk to
> whoever made the change.
>
> > Maybe a mount option for procfs that allows for pre-5.x
> > behaviour?
>
> procfs has nothing to do with this though.
>
> > I'm not the first one to point out that this is a problem, just
> > the first to follow it through to the cause ;( And I believe there is
> > > more than just PostgreSQL that is affected by shared memory (e.g., apache2
> > needs SysV IPC enabled, so anyone doing that in a jail has it enabled
> > also) ...
>
> Also note that SysV IPC is not the problem here, it's the change in
> the behaviour of kill() that is causing postgresql to become confused.
> That's what you should investigate.

The ESRCH error is being returned from prison_check(); that would be a
good starting place.

Andrew

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 04:24:33
Message-ID:	20060403012403.T947@ganymede.hub.org
Lists:	pgsql-hackers

Thanks all ... have moved this to just the freebsd-stable list, since I
don't imagine most here are interested in FreeBSD :(

On Mon, 3 Apr 2006, Andrew Thompson wrote:

> On Sun, Apr 02, 2006 at 11:41:01PM -0400, Kris Kennaway wrote:
>> On Mon, Apr 03, 2006 at 12:30:58AM -0300, Marc G. Fournier wrote:
>>> 'k, but how do I fix kill so that it has the proper behaviour if SysV is
>>> enabled?
>>
>> Check the source, perhaps there's already a way. If not, talk to
>> whoever made the change.
>>
>>> Maybe a mount option for procfs that allows for pre-5.x
>>> behaviour?
>>
>> procfs has nothing to do with this though.
>>
>>> I'm not the first one to point out that this is a problem, just
>>> the first to follow it through to the cause ;( And I believe there is
>>> more than just PostgreSQL that is affected by shared memory (e.g., apache2
>>> needs SysV IPC enabled, so anyone doing that in a jail has it enabled
>>> also) ...
>>
>> Also note that SysV IPC is not the problem here, it's the change in
>> the behaviour of kill() that is causing postgresql to become confused.
>> That's what you should investigate.
>
> The ESRCH error is being returned from prison_check(), that would be a
> good starting place.
>
>
> Andrew

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)freebsd(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 15:49:52
Message-ID:	20060403164139.D36756@fledge.watson.org
Lists:	pgsql-hackers

On Sun, 2 Apr 2006, Tom Lane wrote:

> Oops. Here is the problem: kill() is lying by claiming there is no such
> process as 83699. It looks to me like there in fact is such a process, but
> it's in a different jail.
>
> I venture that FBSD 6 has decided to return ESRCH (no such process) where
> FBSD 4 returned some other error that acknowledged that the process did
> exist (EPERM would be a reasonable guess).
>
> If this is the story, then FBSD have broken their system and must revert
> their change. They do not have kernel behavior that totally hides the
> existence of the other process, and therefore having some calls that pretend
> it's not there is simply inconsistent.

FreeBSD's mandatory access control models, such as multi-level security, Biba
integrity, and type enforcement, will generally provide consistent protection
under the circumstances you describe: specifically, that information flow
invariants across IPC types, including System V IPC and inter-process
signalling, will allow flow only in keeping with the policy.

However, I guess I would counter with the following concern: the PID returned
by semctl() has the following definition:

GETPID Return the pid of the last process to perform an operation
on semaphore number semnum.

However, pid's in general uniquely identify a process only at the time they
are recorded. So any pid returned here is necessarily stale -- even if there
is another process with the pid returned by GETPID, it may actually be a
different process that has ended up with the same pid. The longer the gap
since the last semaphore operation, the more likely (presumably) it is that
the pid has been recycled. And on modern systems with thousands of processes
and high process turn-over (e.g., systems with CGI and other sorts of
scripting), pid reuse can happen quickly. Is your use of the pid here
consistent with the fact that pid's are reused quickly after process exit? Use of
pid's in UNIX is often unreliable, and must be combined with other
synchronizing, such as file locking on a pidfile, to ensure that the pid read
is valid. Even then, you can't implement atomic check-pid-and-signal using
current UNIX APIs, which would require a notion of a process handle (or, in
the parlance of Mach, a task port).
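
As an illustration of the "file locking on a pidfile" idea, here is a minimal sketch in C. It assumes advisory locking via flock() is available (as on FreeBSD and Linux); the helper name acquire_pidfile() and the pidfile path are hypothetical and are not taken from PostgreSQL or FreeBSD. The lock, not the pid written in the file, is what proves that the owner is still alive.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: take an exclusive advisory lock on `path` and
 * record our pid in it.  Returns the open descriptor on success, or -1
 * if the lock is already held elsewhere (or on any other error).
 */
int acquire_pidfile(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* LOCK_NB: fail immediately instead of waiting for the holder. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }

    /* Only the lock holder rewrites the file contents. */
    char buf[32];
    int n = snprintf(buf, sizeof buf, "%ld\n", (long) getpid());
    if (ftruncate(fd, 0) != 0 || write(fd, buf, (size_t) n) != n) {
        close(fd);
        return -1;
    }

    /* Keep fd open for the process lifetime; closing it drops the lock. */
    return fd;
}
```

The kernel releases the lock when the holder exits, even after "kill -9", so a stale pidfile never blocks a restart. This still does not give an atomic check-pid-and-signal, as noted above.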

Another thought along these lines -- especially with the proliferation of
fine-grained access control systems, such as Type Enforcement in SELinux, I
would be cautious about assuming that two processes being able to manipulate
the same semaphore implies the ability to exchange signals using the signal
facility.

Robert N M Watson

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Watson <rwatson(at)FreeBSD(dot)org>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 16:37:04
Message-ID:	14654.1144082224@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Watson <rwatson(at)FreeBSD(dot)org> writes:
> However, pid's in general uniquely identify a process only at the time they
> are recorded. So any pid returned here is necessarily stale -- even if there
> is another process with the pid returned by GETPID, it may actually be a
> different process that has ended up with the same pid. The longer the gap
> since the last semaphore operation, the more likely (presumably) it is that
> the pid has been recycled. And on modern systems with thousands of processes
> and high process turn-over (e.g., systems with CGI and other sorts of
> scripting), pid reuse can happen quickly. Is your use of the pid here
> consistent with the fact that pid's are reused quickly after process exit?

That's a fair question, but in the context of the code I believe we are
behaving reasonably. The reason this code exists is to provide some
insurance against leaking semaphores when a postmaster process is
terminated unexpectedly (ye olde often-recommended-against "kill -9
postmaster", for instance). If the PID returned by GETPID is
nonexistent or belongs to a process not owned by the postgres userid
then we assume that the semaphore set can be recycled. We could get
fooled by PID recycling if the PID returned by GETPID belongs to a
postgres-owned process that isn't actually the original owner, but
the penalty is just that we'll fail to recycle semaphores that could
be released. Not very harmful, and not very probable either, unless
you're running postgres under a userid that's used for a lot of other
stuff too. There is not much risk of long-term leakage of many
semaphore sets, even if you've got lots of postmaster crashes going on
(which I sure hope you don't). The code is designed to retry the same
semaphore keys on each cycle of life, so you'd have to get fooled by
chance coincidence of existing PIDs every time over many cycles to
have a severe resource-leakage problem. (BTW, Marc, that's the reason
for *not* randomizing the key selection as you suggested.)

So I think the code is pretty bulletproof as long as it's in a system
that is behaving per SysV spec. The problem in the current FBSD
situation is that the jail mechanism is exposing semaphore sets across
jails, but not exposing the existence of the owning processes. That
behavior is inconsistent: if process A can affect the state of a sema
set that process B can see, it's surely unreasonable to pretend that A
doesn't exist.
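
The existence check described above can be sketched roughly as follows. This is a minimal illustration in C, not the actual PostgreSQL source: the names pid_status, probe_pid() and sema_set_looks_orphaned() are hypothetical, and the sketch assumes the probe is done with kill(pid, 0), which performs the permission and existence checks without delivering a signal.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

typedef enum {
    PID_ALIVE_OURS,     /* kill(pid, 0) succeeded: exists and we may signal it */
    PID_ALIVE_FOREIGN,  /* EPERM: exists, but is owned by another userid */
    PID_GONE            /* ESRCH (or other error): reported as nonexistent */
} pid_status;

/* Classify a pid, e.g. one obtained from semctl(semid, 0, GETPID). */
pid_status probe_pid(pid_t pid)
{
    assert(pid > 0);

    if (kill(pid, 0) == 0)
        return PID_ALIVE_OURS;
    return (errno == EPERM) ? PID_ALIVE_FOREIGN : PID_GONE;
}

/*
 * Hypothetical policy matching the description: the set is treated as
 * recyclable unless its last user is a live process we could signal.
 */
int sema_set_looks_orphaned(pid_t last_pid)
{
    return probe_pid(last_pid) != PID_ALIVE_OURS;
}
```

Under this policy, a kernel that answers ESRCH for a live postmaster in another jail (running under the same userid) makes that postmaster's semaphore set look orphaned, which is the failure being discussed.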

regards, tom lane

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 16:49:42
Message-ID:	20060403174043.S76562@fledge.watson.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Tom Lane wrote:

> That's a fair question, but in the context of the code I believe we are
> behaving reasonably. The reason this code exists is to provide some
> insurance against leaking semaphores when a postmaster process is terminated
> unexpectedly (ye olde often-recommended-against "kill -9 postmaster", for
> instance). If the PID returned by GETPID is nonexistent or belongs to a
> process not owned by the postgres userid then we assume that the semaphore
> set can be recycled. We could get fooled by PID recycling if the PID
> returned by GETPID belongs to a postgres-owned process that isn't actually
> the original owner, but the penalty is just that we'll fail to recycle
> semaphores that could be released. Not very harmful, and not very probable
> either, unless you're running postgres under a userid that's used for a lot
> of other stuff too. There is not much risk of long-term leakage of many
> semaphore sets, even if you've got lots of postmaster crashes going on
> (which I sure hope you don't). The code is designed to retry the same
> semaphore keys on each cycle of life, so you'd have to get fooled by chance
> coincidence of existing PIDs every time over many cycles to have a severe
> resource-leakage problem. (BTW, Marc, that's the reason for *not*
> randomizing the key selection as you suggested.)
>
> So I think the code is pretty bulletproof as long as it's in a system that
> is behaving per SysV spec. The problem in the current FBSD situation is
> that the jail mechanism is exposing semaphore sets across jails, but not
> exposing the existence of the owning processes. That behavior is
> inconsistent: if process A can affect the state of a sema set that process B
> can see, it's surely unreasonable to pretend that A doesn't exist.

Maybe I've misunderstood the problem here -- is the use of the GETPID
operation occurring within a coordinated set of server processes, or does it
also occur between client and server processes? I think it's quite reasonable
to argue that a coordinated set of server processes should be able to see each
other, especially if they're running as the same user, in the same jail,
started at the same time. After all, coordinated server applications
frequently use signals to manage resources and perform asynchronous
notification (i.e., SIGCHLD, SIGHUP, etc). If we're talking about clients and
servers coordinating using the same System V IPC name space, I find myself
less sympathetic to the idea that otherwise unrelated processes on either side
of the IPC mechanism should be using out-of-band process operations to test
for mutual presence.

There has been occasional investigation of virtualizing the System V IPC name
space, but as you are no doubt aware, the name space doesn't lend itself to
virtualization, as it fails to be conveniently hierarchical, etc. This is
just another of the ways in which System V IPC offers quite useful IPC
services in less useful ways. I would, in general, consider the use of System
V IPC across jails (as opposed to in a single jail) unsupported, since it's
not consistent with the security model. However, I have doubts about the
behavioral dependency we're talking about above.

Robert N M Watson

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Watson <rwatson(at)FreeBSD(dot)org>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 17:07:39
Message-ID:	14905.1144084059@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Watson <rwatson(at)FreeBSD(dot)org> writes:
> Maybe I've misunderstood the problem here -- is the use of the GETPID
> operation occurring within a coordinated set of server processes, or does it
> also occur between client and server processes? I think it's quite reasonable
> to argue that a coordinated set of server processes should be able to see each
> other, especially if they're running as the same user, in the same jail,
> started at the same time.

We use the semaphore sets only within postgres server processes; we
could hardly expect client processes to be able to get at them, since
in general clients aren't on the same machine. The issue here, though,
is that Marc is trying to start multiple postgres servers in different
jails, and in that context the different postgres servers aren't
"coordinated" in any real sense. We'd prefer that they didn't interact
at all, but they are interacting because the SysV code isn't restricting
IPC to occur only within a jail.

BTW, Marc, it occurs to me that a workaround for you would be to create
a separate userid for postgres to run under in each jail; then the
regular protection mechanisms would prevent the different postmasters
from interfering with each other's semaphore sets. But I think that
workaround just makes it even clearer that the jail mechanism isn't
behaving very sanely.

> I would, in general, consider the use of System
> V IPC across jails (as opposed to in a single jail) unsupported, since it's
> not consistent with the security model.

That'd be fine with me --- the problem here is that we've got unwanted
communication across jails. If, say, the jail ID were considered part
of semaphore keys, we'd be in fine shape.

regards, tom lane

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 17:22:42
Message-ID:	20060403181621.P76562@fledge.watson.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Tom Lane wrote:

> Robert Watson <rwatson(at)FreeBSD(dot)org> writes:
>> Maybe I've misunderstood the problem here -- is the use of the GETPID
>> operation occurring within a coordinated set of server processes, or does it
>> also occur between client and server processes? I think it's quite reasonable
>> to argue that a coordinated set of server processes should be able to see each
>> other, especially if they're running as the same user, in the same jail,
>> started at the same time.
>
> We use the semaphore sets only within postgres server processes; we could
> hardly expect client processes to be able to get at them, since in general
> clients aren't on the same machine. The issue here, though, is that Marc is
> trying to start multiple postgres servers in different jails, and in that
> context the different postgres servers aren't "coordinated" in any real
> sense. We'd prefer that they didn't interact at all, but they are
> interacting because the SysV code isn't restricting IPC to occur only within
> a jail.
>
> BTW, Marc, it occurs to me that a workaround for you would be to create a
> separate userid for postgres to run under in each jail; then the regular
> protection mechanisms would prevent the different postmasters from
> interfering with each other's semaphore sets. But I think that workaround
> just makes it even clearer that the jail mechanism isn't behaving very
> sanely.

Any multi-instance application that uses unvirtualized System V IPC must know
how to distinguish between those instances. This is true of any potential
communication mechanism used by multi-instance applications -- be it a command
line argument to specify an alternative configuration file, or a configuration
file that specifies alternative ports, working directories, mail spool
directories, etc. If you install two instances of sendmail, it requires some
configuration to teach them not to step all over each other, and this is not
an accident: if they try to use the same mail spools, ports, etc, things will
go badly. I can't imagine that PostgreSQL should be any different -- it has
to be pointed at what resources to use and how to use them -- some of that
will be a property of how it's written, and some how it's configured.
Presumably, running multiple instances of PostgreSQL in jails should not be
all that different from running multiple instances on any UNIX machine: they
must not overlap where shared resources are concerned.

How is PostgreSQL deciding what semaphores to use? Can it be instructed to
use non-colliding ones by specifying an alternative argument to pass to
ftok(), or ID to use directly?

>> I would, in general, consider the use of System V IPC across jails (as
>> opposed to in a single jail) unsupported, since it's not consistent with
>> the security model.
>
> That'd be fine with me --- the problem here is that we've got unwanted
> communication across jails. If, say, the jail ID were considered part of
> semaphore keys, we'd be in fine shape.

Well, I think it's definitely unwanted communications, but until such time as
FreeBSD supports virtualizing the System V IPC name spaces, the fact that you
can communicate between jails when System V IPC support is turned on for the
jail shouldn't be a surprise, and should in fact be considered a feature.
However, if applications behave incorrectly when treading over each other
because either they aren't written to support specifying how not to walk over
each other, or if they are not configured to use that support, then they're
not going to behave well :-).

Robert N M Watson

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Watson <rwatson(at)FreeBSD(dot)org>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 17:52:33
Message-ID:	15174.1144086753@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Watson <rwatson(at)FreeBSD(dot)org> writes:
> Any multi-instance application that uses unvirtualized System V IPC must know
> how to distinguish between those instances.

Sure.

> How is PostgreSQL deciding what semaphores to use? Can it be instructed to
> use non-colliding ones by specifying an alternative argument to pass to
> ftok(), or ID to use directly?

The problem here is not that we don't know how to avoid a collision.
The problem is stemming from code that we added to prevent semaphore
leakage during failure recoveries. The code believes that it is
deleting a semaphore set left over from a crashed previous instance
of the same postmaster.

We don't use ftok() to determine the keys, and I'm disinclined to think
that doing so would improve the situation: you could still have key
collisions, they'd just be unpredictable and there'd be no convenient
mechanism for escaping one if you hit it.

> However, if applications behave incorrectly when treading over each other
> because either they aren't written to support specifying how not to walk over
> each other, or if they are not configured to use that support, then they're
> not going to behave well :-).

Postgres is absolutely designed not to walk all over itself. It is,
however, designed to clean up after itself, and I don't consider that a
bug. The problem here is that by redefining the behavior of kill, you've
prevented the code from detecting the existence of the other postmaster,
and thereby triggered the cleanup behavior.

I don't exactly see why it's considered such a critical security feature
that kill return ESRCH rather than, say, EPERM for processes in another
jail. kill won't tell you what that process is or what it's doing,
so the amount of information leaked is certainly pretty trivial. It'd
be fine if FBSD actually had a jail implementation that leaked zero
information, but you don't --- in fact, you're saying it's a feature
that you don't.

Perhaps a reasonable compromise would be to have the
SysV-IPC-allowed-in-jails switch also restore the normal return value
of kill(). This seems sensible to me because the GETPID feature makes
PIDs be part of the API that is exposed across jails.

regards, tom lane

From:	Vivek Khera <vivek(at)khera(dot)org>
To:	freebsd-stable <freebsd-stable(at)freebsd(dot)org>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 18:22:23
Message-ID:	A1072D0B-7416-493C-8CCC-C9126134A9B3@khera.org
Lists:	pgsql-hackers

On Apr 3, 2006, at 12:37 PM, Tom Lane wrote:

> semaphore keys on each cycle of life, so you'd have to get fooled by
> chance coincidence of existing PIDs every time over many cycles to
> have a severe resource-leakage problem. (BTW, Marc, that's the reason
> for *not* randomizing the key selection as you suggested.)

Seems to me the way around this with minimal fuss is to add a flag to
postgres to have it start at different points in the ID sequence.
So pg#1 would start at the first position, pg#2 at the second ID position,
etc. Then just hard-code an "instance ID" into the startup script for each
pg. No randomization makes it easier to debug, and unique IDs avoid
clashes in normal cases.
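
A minimal sketch of that idea in C, under stated assumptions: keys are derived deterministically from the port (as the thread's subject suggests), and each instance gets its own contiguous key range. The function sema_key_for(), the constants MAX_INSTANCES and KEYS_PER_INSTANCE, and the layout of the key space are all hypothetical; none of this is an existing postgres option.

```c
#include <assert.h>
#include <sys/ipc.h>
#include <sys/types.h>

#define MAX_INSTANCES      16    /* assumed upper bound on instance IDs */
#define KEYS_PER_INSTANCE  1000  /* assumed size of each instance's key range */

/*
 * Deterministic key for the given port, instance ID and retry attempt.
 * Different instance IDs yield disjoint key ranges, so two servers on the
 * same port never probe each other's semaphore sets.
 */
key_t sema_key_for(int port, int instance_id, int attempt)
{
    assert(port > 0 && port <= 65535);
    assert(instance_id >= 0 && instance_id < MAX_INSTANCES);
    assert(attempt >= 0 && attempt < KEYS_PER_INSTANCE);

    /* Largest value is below 2^31, so it fits a 32-bit key_t. */
    long slot = (long) port * MAX_INSTANCES + instance_id;
    return (key_t) (slot * KEYS_PER_INSTANCE + attempt);
}
```

Because the mapping is fixed, a restarted server retries exactly the same keys on each cycle of life, which keeps the leak-recovery property Tom describes while separating the instances.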

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Watson <rwatson(at)FreeBSD(dot)org>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 19:42:51
Message-ID:	20060403194251.GF4474@ns.snowman.net
Lists:	pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> That's a fair question, but in the context of the code I believe we are
> behaving reasonably. The reason this code exists is to provide some
> insurance against leaking semaphores when a postmaster process is
> terminated unexpectedly (ye olde often-recommended-against "kill -9
> postmaster", for instance). If the PID returned by GETPID is

Could this be handled sensibly by using SEM_UNDO? Just a thought.

> So I think the code is pretty bulletproof as long as it's in a system
> that is behaving per SysV spec. The problem in the current FBSD
> situation is that the jail mechanism is exposing semaphore sets across
> jails, but not exposing the existence of the owning processes. That
> behavior is inconsistent: if process A can affect the state of a sema
> set that process B can see, it's surely unreasonable to pretend that A
> doesn't exist.

This is certainly a problem with FBSD jails... Not only the
inconsistency, but what happens if someone manages to get access to the
appropriate uid under one jail and starts sniffing or messing with the
semaphores or shared memory segments from other jails? If that's
possible then that's a rather glaring security problem...

Thanks,

Stephen

From:	Kris Kennaway <kris(at)obsecurity(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Watson <rwatson(at)FreeBSD(dot)org>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 19:45:24
Message-ID:	20060403194524.GA58237@xor.obsecurity.org
Lists:	pgsql-hackers

On Mon, Apr 03, 2006 at 03:42:51PM -0400, Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > That's a fair question, but in the context of the code I believe we are
> > behaving reasonably. The reason this code exists is to provide some
> > insurance against leaking semaphores when a postmaster process is
> > terminated unexpectedly (ye olde often-recommended-against "kill -9
> > postmaster", for instance). If the PID returned by GETPID is
>
> Could this be handled sensibly by using SEM_UNDO? Just a thought.
>
> > So I think the code is pretty bulletproof as long as it's in a system
> > that is behaving per SysV spec. The problem in the current FBSD
> > situation is that the jail mechanism is exposing semaphore sets across
> > jails, but not exposing the existence of the owning processes. That
> > behavior is inconsistent: if process A can affect the state of a sema
> > set that process B can see, it's surely unreasonable to pretend that A
> > doesn't exist.
>
> This is certainly a problem with FBSD jails... Not only the
> inconsistency, but what happens if someone manages to get access to the
> appropriate uid under one jail and starts sniffing or messing with the
> semaphores or shared memory segments from other jails? If that's
> possible then that's a rather glaring security problem...

This was stated already upthread, but sysv IPC is disabled by default
in jails for precisely this reason. So yes, when you turn it on it's
a potential security problem if your jails are supposed to be
compartmentalized.

Kris

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Watson <rwatson(at)FreeBSD(dot)org>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 19:50:18
Message-ID:	20060403195018.GG4474@ns.snowman.net
Lists:	pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> BTW, Marc, it occurs to me that a workaround for you would be to create
> a separate userid for postgres to run under in each jail; then the
> regular protection mechanisms would prevent the different postmasters
> from interfering with each other's semaphore sets. But I think that
> workaround just makes it even clearer that the jail mechanism isn't
> behaving very sanely.

Just to toss it in there, I do this on some systems where we use Linux
VServers. It's just so that when I'm looking at a process list across
the whole system it's easy to tell which processes are inside which
vservers (since the only thing which should be running in a given
vserver is a single Postgres instance which should only be running with
the uid/gid corresponding to that vserver, and that uid/gid is recorded
in the host passwd file with a name associated with it since that's the
passwd file used when looking at all pids).

I also just double-checked with the Linux VServer folks and they confirm
that IPC objects inside the vserver are isolated from all other IPC objects
on the system.

Thanks,

Stephen

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Robert Watson <rwatson(at)FreeBSD(dot)org>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 19:57:43
Message-ID:	16158.1144094263@sss.pgh.pa.us
Lists:	pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Could this be handled sensibly by using SEM_UNDO? Just a thought.

Interesting thought, but I think it doesn't work for the special case
where the semaphore's "previous owner" is actually our same PID ---
which is actually the more commonly exercised path, since true
postmaster crashes are pretty rare. More commonly we're trying to
detach from and recreate our own shmem and semas following a backend
crash. We can special-case that pretty easily with the GETPID solution
(pid == ours is obviously not some other process's sema), but with
SEM_UNDO it wouldn't work right.
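
The special case can be written down as a small decision function. This is only a sketch in C: may_reclaim_sema_set() is a hypothetical name, and the liveness of the other process is passed in as a flag (in practice it would come from some existence probe) so that the ordering of the checks is the only thing shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Decide whether an existing semaphore set may be detached and recreated.
 *   last_pid        pid reported for the set (e.g. via semctl GETPID)
 *   my_pid          pid of the postmaster doing the check
 *   last_pid_alive  result of a separate existence probe on last_pid
 */
bool may_reclaim_sema_set(pid_t last_pid, pid_t my_pid, bool last_pid_alive)
{
    assert(my_pid > 0);

    /* Common path: we are recreating our own set after a backend crash. */
    if (last_pid == my_pid)
        return true;

    /* Otherwise reclaim only if the previous user no longer exists. */
    return !last_pid_alive;
}
```

The own-pid test comes first because it needs no probe at all, which is the part a SEM_UNDO-based scheme would not reproduce.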

I'm also concerned about the portability risks of depending on SEM_UNDO.
I think a lot of systems set the semaphore-undo limits pretty small,
maybe even zero.

BTW, as long as we're annoying the freebsd-stable list with discussions
of workarounds, I'm wondering whether our shared memory code might have
similar risks. Does FBSD 6 also lie about the existence of other-jail
processes connected to a shared memory segment --- ie, in
shmctl(IPC_STAT)'s result, does shm_nattch count only processes in our
own jail?

regards, tom lane

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 22:25:39
Message-ID:	20060403231143.O76562@fledge.watson.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Tom Lane wrote:

> Robert Watson <rwatson(at)FreeBSD(dot)org> writes:
>> Any multi-instance application that uses unvirtualized System V IPC must know
>> how to distinguish between those instances.
>
> Sure.
>
>> How is PostgreSQL deciding what semaphores to use? Can it be instructed to
>> use non-colliding ones by specifying an alternative argument to pass to
>> ftok(), or ID to use directly?
>
> The problem here is not that we don't know how to avoid a collision. The
> problem is stemming from code that we added to prevent semaphore leakage
> during failure recoveries. The code believes that it is deleting a
> semaphore set left over from a crashed previous instance of the same
> postmaster.
>
> We don't use ftok() to determine the keys, and I'm disinclined to think that
> doing so would improve the situation: you could still have key collisions,
> they'd just be unpredictable and there'd be no convenient mechanism for
> escaping one if you hit it.

I guess what I'm saying is this: by turning on system V IPC in a jail,
administrators accept that they are using an unsupported configuration, in
which the security features of jail, which include hiding the process state of
other jails, are known to conflict with the System V IPC services. We
specifically disable System V IPC in jails because it is known to have
undesirable properties. When configuring systems in that state, the
responsibility falls on the administrator to disambiguate the configuration by
specifying which resources must be used in order to prevent a conflict,
because software operating in that environment will not be able to do so
properly. The goal of the switch to enable System V IPC is to allow IPC to be
enabled for a single jail at a time, where it can be used to its full
capabilities, without violating the security model. If it is turned on for
more than one jail, then isolation is not provided for System V IPC.

So my recommendation is, if people want to run Postgres in more than one jail
at a time, they be provided with a configuration option to disambiguate which
semaphore to use: they must hard-code that it will not use the same semaphore
already in use by another Postgres instance in another Jail. This is no
different than specifying that if there are multiple Apaches running on a
single system, that they run on different port/IP combinations. If they
aren't configured to do so, one of them will encounter an error when running,
because the resource is already in use, and you may get unpredictable results
if the two Apaches are started at the same time, restarted, etc, as they will
race to acquire the resource.

Whether you pull the resource ID out of a hat, use ftok(), or whatever, I
really don't mind, and have no strong opinion. The name space of System V IPC is
one of the known problems with the IPC model, and sadly, one accepts those
problems by using those IPC mechanisms.

Robert N M Watson
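
[Illustration] The exchange above contrasts ftok()-style keys with keys the server picks itself, and the thread's opening message reports that moving one server to port 5433 avoids the collision. A minimal sketch of a port-derived key scheme follows. The constant KEYS_PER_PORT and the helper ipc_key_for_port are hypothetical names, and the "port times 1000" spacing is an assumption made for illustration, not something stated in this thread.

```c
#include <sys/types.h>
#include <sys/ipc.h>

/* Assumption for this sketch: each port gets a block of 1000 keys. */
#define KEYS_PER_PORT 1000

/*
 * Derive a System V IPC key from a TCP port and a retry counter.
 * Two servers on different ports get disjoint key ranges as long as
 * attempt stays below KEYS_PER_PORT; two servers on the same port
 * start from the same key, which is the collision discussed above.
 */
key_t
ipc_key_for_port(int port, int attempt)
{
    return (key_t) port * KEYS_PER_PORT + attempt;
}
```

With a scheme like this, the port number already acts as the administrator-supplied disambiguator: it is predictable, and stepping to a different port is a convenient way out of a collision.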

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 22:37:33
Message-ID:	20060403233540.D76562@fledge.watson.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Stephen Frost wrote:

>> So I think the code is pretty bulletproof as long as it's in a system that
>> is behaving per SysV spec. The problem in the current FBSD situation is
>> that the jail mechanism is exposing semaphore sets across jails, but not
>> exposing the existence of the owning processes. That behavior is
>> inconsistent: if process A can affect the state of a sema set that process
>> B can see, it's surely unreasonable to pretend that A doesn't exist.
>
> This is certainly a problem with FBSD jails... Not only the inconsistency,
> but what happens if someone manages to get access to the appropriate uid
> under one jail and starts sniffing or messing with the semaphores or shared
> memory segments from other jails? If that's possible then that's a rather
> glaring security problem...

This is why it's disabled by default, and the jail documentation specifically
advises of this possibility. Excerpt below.

Robert N M Watson

security.jail.sysvipc_allowed
This MIB entry determines whether or not processes within a jail
have access to System V IPC primitives. In the current jail
implementation, System V primitives share a single namespace across the
host and jail environments, meaning that processes within a jail
would be able to communicate with (and potentially interfere with)
processes outside of the jail, and in other jails. As such, this
functionality is disabled by default, but can be enabled by setting
this MIB entry to 1.
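
[Illustration] The inconsistency quoted above (a semaphore set is visible across jails, but its owning process is not) matters to any cleanup routine that decides a set is stale by probing its creator. A minimal sketch of such a probe follows, assuming the conventional kill(pid, 0) existence test. The helper names process_appears_alive and creator_looks_dead are hypothetical, and this is not presented as PostgreSQL's actual cleanup code.

```c
#include <sys/types.h>
#include <signal.h>
#include <errno.h>
#include <unistd.h>

/*
 * Returns 1 if the process appears to exist, 0 if the kernel reports
 * ESRCH (no such process). Any other error, for example EPERM, is
 * treated as "exists", the conservative choice before deleting
 * someone else's IPC objects.
 */
int
process_appears_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    return errno != ESRCH;
}

/*
 * A set created by some other pid that no longer appears to exist
 * would be treated as left over from a crash. If the kernel hides a
 * live creator in another jail, this wrongly returns 1.
 */
int
creator_looks_dead(pid_t creator)
{
    return creator != getpid() && !process_appears_alive(creator);
}
```

The probe is only as trustworthy as the kernel's answer: when a live process in another jail is reported as nonexistent, a routine built on it would conclude the set is orphaned.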

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 22:40:51
Message-ID:	20060403233826.Q76562@fledge.watson.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Tom Lane wrote:

> BTW, as long as we're annoying the freebsd-stable list with discussions of
> workarounds, I'm wondering whether our shared memory code might have similar
> risks. Does FBSD 6 also lie about the existence of other-jail processes
> connected to a shared memory segment --- ie, in shmctl(IPC_STAT)'s result,
> does shm_nattch count only processes in our own jail?

People are, of course, welcome to read the Jail documentation in order to read
the warning about not enabling the System V IPC support in Jails, and what the
possible results of doing so are.

Robert N M Watson

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Robert Watson <rwatson(at)FreeBSD(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 22:51:45
Message-ID:	20060403225145.GI4474@ns.snowman.net
Lists:	pgsql-hackers

* Robert Watson (rwatson(at)FreeBSD(dot)org) wrote:
> On Mon, 3 Apr 2006, Stephen Frost wrote:
> >This is certainly a problem with FBSD jails... Not only the
> >inconsistency, but what happens if someone manages to get access to the
> >appropriate uid under one jail and starts sniffing or messing with the
> >semaphores or shared memory segments from other jails? If that's possible
> >then that's a rather glaring security problem...
>
> This is why it's disabled by default, and the jail documentation
> specifically advises of this possibility. Excerpt below.

Ah, I see, glad to see it's accurately documented. Given the rather
significant use of shared memory by Postgres it seems to me that
jail'ing it under FBSD is unlikely to get you the kind of isolation
between instances that you want (the assumption being that you want to
avoid the possibility of a user under one jail impacting a user in
another jail). As such, I'd suggest finding something else if you
truly need that isolation for Postgres or dropping the jails entirely.

Running the Postgres instances under different uids (as you'd probably
expect to do anyway if not using the jails) is probably the right
approach. Doing that and using jails would probably work, just don't
delude yourself into thinking that you're safe from a malicious user in
one jail.

Thanks,

Stephen

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)FreeBSD(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 22:56:13
Message-ID:	20060403235222.W76562@fledge.watson.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Stephen Frost wrote:

>> This is why it's disabled by default, and the jail documentation
>> specifically advises of this possibility. Excerpt below.
>
> Ah, I see, glad to see it's accurately documented.

As it has been for the last five years, I believe since introduction of the
setting to allow System V IPC to be used with documented limitations.

> Given the rather significant use of shared memory by Postgres it seems to me
> that jail'ing it under FBSD is unlikely to get you the kind of isolation
> between instances that you want (the assumption being that you want to avoid
> the possibility of a user under one jail impacting a user in another jail).
> As such, I'd suggest finding something else if you truly need that
> isolation for Postgres or dropping the jails entirely.
>
> Running the Postgres instances under different uids (as you'd probably
> expect to do anyway if not using the jails) is probably the right approach.
> Doing that and using jails would probably work, just don't delude yourself
> into thinking that you're safe from a malicious user in one jail.

Yes, there seems to be an awful lot of noise being made about the fact that
the system does, in fact, work exactly as documented, and that the
configuration being complained about is one that is specifically documented as
being unsupported and undesirable.

As commented elsewhere in this thread, currently, there is no virtualization
support for System V IPC in the FreeBSD Jail implementation. That may change
if/when someone implements it. Until it's implemented, it isn't going to be
there, and the system won't behave as though it's there no matter how much
jumping up and down is done.

Robert N M Watson

From:	Kris Kennaway <kris(at)obsecurity(dot)org>
To:	Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 22:57:12
Message-ID:	20060403225712.GA63521@xor.obsecurity.org
Lists:	pgsql-hackers

On Mon, Apr 03, 2006 at 06:51:45PM -0400, Stephen Frost wrote:
> * Robert Watson (rwatson(at)FreeBSD(dot)org) wrote:
> > On Mon, 3 Apr 2006, Stephen Frost wrote:
> > >This is certainly a problem with FBSD jails... Not only the
> > >inconsistency, but what happens if someone manages to get access to the
> > >appropriate uid under one jail and starts sniffing or messing with the
> > >semaphores or shared memory segments from other jails? If that's possible
> > >then that's a rather glaring security problem...
> >
> > This is why it's disabled by default, and the jail documentation
> > specifically advises of this possibility. Excerpt below.
>
> Ah, I see, glad to see it's accurately documented. Given the rather
> significant use of shared memory by Postgres it seems to me that
> jail'ing it under FBSD is unlikely to get you the kind of isolation
> between instances that you want (the assumption being that you want to
> avoid the possibility of a user under one jail impacting a user in
> another jail). As such, I'd suggest finding something else if you
> truly need that isolation for Postgres or dropping the jails entirely.
>
> Running the Postgres instances under different uids (as you'd probably
> expect to do anyway if not using the jails) is probably the right
> approach. Doing that and using jails would probably work, just don't
> delude yourself into thinking that you're safe from a malicious user in
> one jail.

Yes; however jails are still useful for administrative
compartmentalization even when you have to weaken their security
properties, such as here.

Kris

From:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)FreeBSD(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-03 23:46:32
Message-ID:	20060403204355.T947@ganymede.hub.org
Lists:	pgsql-hackers

On Mon, 3 Apr 2006, Stephen Frost wrote:

> * Robert Watson (rwatson(at)FreeBSD(dot)org) wrote:
>> On Mon, 3 Apr 2006, Stephen Frost wrote:
>>> This is certainly a problem with FBSD jails... Not only the
>>> inconsistency, but what happens if someone manages to get access to the
>>> appropriate uid under one jail and starts sniffing or messing with the
>>> semaphores or shared memory segments from other jails? If that's possible
>>> then that's a rather glaring security problem...
>>
>> This is why it's disabled by default, and the jail documentation
>> specifically advises of this possibility. Excerpt below.
>
> Ah, I see, glad to see it's accurately documented. Given the rather
> significant use of shared memory by Postgres it seems to me that
> jail'ing it under FBSD is unlikely to get you the kind of isolation
> between instances that you want (the assumption being that you want to
> avoid the possibility of a user under one jail impacting a user in
> another jail). As such, I'd suggest finding something else if you
> truly need that isolation for Postgres or dropping the jails entirely.
>
> Running the Postgres instances under different uids (as you'd probably
> expect to do anyway if not using the jails) is probably the right
> approach. Doing that and using jails would probably work, just don't
> delude yourself into thinking that you're safe from a malicious user in
> one jail.

We don't ... we put all our databases on a central database server, even
private ones, that nobody has shell access to ... we keep them isolated
...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc:	Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, freebsd-stable(at)FreeBSD(dot)org, Kris Kennaway <kris(at)obsecurity(dot)org>
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-04 01:19:04
Message-ID:	20060404011904.GJ4474@ns.snowman.net
Lists:	pgsql-hackers

* Marc G. Fournier (scrappy(at)postgresql(dot)org) wrote:
> On Mon, 3 Apr 2006, Stephen Frost wrote:
> >Running the Postgres instances under different uids (as you'd probably
> >expect to do anyway if not using the jails) is probably the right
> >approach. Doing that and using jails would probably work, just don't
> >delude yourself into thinking that you're safe from a malicious user in
> >one jail.
>
> We don't ... we put all our databases on a central database server, even
> private ones, that nobody has shell access to ... we keep them isolated
> ...

I guess what I was trying to get at is this:

Running 2 Postgres instances under FreeBSD with (or without really, but
I guess that's more obvious) jails but with the same UID is a bad idea.
Even if Postgres could be modified to allow this to work you're going to
be in a position where the jail isn't really helping much except to give
a somewhat false (in this case) sense of security. We probably
shouldn't encourage it and in fact it's something of a nice feature that
it breaks.

The reasoning is pretty simple: if someone manages to get control of
one of the Postgres instances they're going to be able to wreak havoc on
the other. With different UIDs, with or without jails, this would be
much more difficult (need to get root first).

Running 2 Postgres instances under FreeBSD with jails *and* different
UIDs is *probably* better than w/o jails but since you have to enable
the single-instance IPC system it might not be that great of a benefit
over a simple chroot or similar.

Hope that helps...

Thanks,

Stephen
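
[Illustration] The UID argument above can be made concrete with a simplified model of the System V IPC permission check. This is a sketch under stated assumptions: group permissions are ignored for brevity, uid 0 is treated as always permitted, and ipc_access_permitted is a hypothetical helper, not a kernel or PostgreSQL function. Note that no jail identifier appears anywhere in the check, which is the point being made in the thread.

```c
#include <sys/types.h>

/*
 * Simplified model of a System V IPC access decision.
 *   caller         uid asking for access
 *   owner, creator uids recorded on the IPC object
 *   mode           permission bits on the object, e.g. 0600
 *   want           requested access: 04 = read, 02 = alter/write
 * Group bits are deliberately left out of this sketch.
 */
int
ipc_access_permitted(uid_t caller, uid_t owner, uid_t creator,
                     mode_t mode, mode_t want)
{
    mode_t granted;

    if (caller == 0)
        return 1;                   /* root bypasses the mode bits */
    if (caller == owner || caller == creator)
        granted = (mode >> 6) & 07; /* owner permission bits */
    else
        granted = mode & 07;        /* "other" permission bits */
    return (granted & want) == want;
}
```

Under this model, an object created with mode 0600 by uid 1001 is open to uid 1001 from any jail and to root from any jail, but closed to uid 1002, which matches the reasoning for running each instance under its own UID.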

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, kris(at)obsecurity(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-11 19:27:33
Message-ID:	200604111927.k3BJRXH26498@candle.pha.pa.us
Lists:	pgsql-hackers

[ FreeBSD email list removed.]

I totally agree, and have added the attached documentation patch to
recommend using different users in FreeBSD jails.

---------------------------------------------------------------------------

Stephen Frost wrote:
> * Marc G. Fournier (scrappy(at)postgresql(dot)org) wrote:
> > On Mon, 3 Apr 2006, Stephen Frost wrote:
> > >Running the Postgres instances under different uids (as you'd probably
> > >expect to do anyway if not using the jails) is probably the right
> > >approach. Doing that and using jails would probably work, just don't
> > >delude yourself into thinking that you're safe from a malicious user in
> > >one jail.
> >
> > We don't ... we put all our databases on a central database server, even
> > private ones, that nobody has shell access to ... we keep them isolated
> > ...
>
> I guess what I was trying to get at is this:
>
> Running 2 Postgres instances under FreeBSD with (or without really, but
> I guess that's more obvious) jails but with the same UID is a bad idea.
> Even if Postgres could be modified to allow this to work you're going to
> be in a position where the jail isn't really helping much except to give
> a somewhat false (in this case) sense of security. We probably
> shouldn't encourage it and in fact it's something of a nice feature that
> it breaks.
>
> The reasoning is pretty simple: if someone manages to get control of
> one of the Postgres instances they're going to be able to wreak havoc on
> the other. With different UIDs, with or without jails, this would be
> much more difficult (need to get root first).
>
> Running 2 Postgres instances under FreeBSD with jails *and* different
> UIDs is *probably* better than w/o jails but since you have to enable
> the single-instance IPC system it might not be that great of a benefit
> over a simple chroot or similar.
>
> Hope that helps...
>
> Thanks,
>
> Stephen

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

Attachment	Content-Type	Size
unknown_filename	text/plain	1.3 KB

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, kris(at)obsecurity(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-11 19:40:18
Message-ID:	20060411194018.GC4474@ns.snowman.net
Lists:	pgsql-hackers

* Bruce Momjian (pgman(at)candle(dot)pha(dot)pa(dot)us) wrote:
> <para>
> + If running in FreeBSD jails by enabling <application>sysctl</>'s
> + <literal>security.jail.sysvipc_allowed</>, <application>postmaster</>s
> + running in different jails should be run by different operating system
> + users. This improves security because it prevents one jail from
> + interfering with shared memory or semaphores in another, and it
> + allows the PostgreSQL IPC cleanup code to function properly.
> + (In FreeBSD 6.0 and later the IPC cleanup code doesn't properly detect
> + processes in other jails, preventing the running of postmasters on the
> + same port in different jails.)
> + </para>

This looks good, my only comment would be that we don't want people to
believe that using different users somehow makes the sysv spaces
separate between the jails. It doesn't. Even when using different
uids, a user who gets root in one jail would be able to mess with the
Postgres instance in the other jail through IPC.

Perhaps change:

"This improves security because it prevents one jail from
interfering with shared memory or semaphores in another"

to:

"This improves security because it prevents the postgres user in one
jail from interfering with shared memory or semaphores owned by a
different user in another jail (with BSD jails, root, or the same
UID, in any jail can see and interfere with the shared memory and
semaphores in any other jail of the same UID, or all if root)"

That's still not great but I think it's a little better...

Thanks,

Stephen

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, kris(at)obsecurity(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-11 19:42:58
Message-ID:	200604111942.k3BJgw604841@candle.pha.pa.us
Lists:	pgsql-hackers

Stephen Frost wrote:
> * Bruce Momjian (pgman(at)candle(dot)pha(dot)pa(dot)us) wrote:
> > <para>
> > + If running in FreeBSD jails by enabling <application>sysctl</>'s
> > + <literal>security.jail.sysvipc_allowed</>, <application>postmaster</>s
> > + running in different jails should be run by different operating system
> > + users. This improves security because it prevents one jail from
> > + interfering with shared memory or semaphores in another, and it
> > + allows the PostgreSQL IPC cleanup code to function properly.
> > + (In FreeBSD 6.0 and later the IPC cleanup code doesn't properly detect
> > + processes in other jails, preventing the running of postmasters on the
> > + same port in different jails.)
> > + </para>
>
> This looks good, my only comment would be that we don't want people to
> believe that using different users somehow makes the sysv spaces
> separate between the jails. It doesn't. Even when using different
> uids, a user who gets root in one jail would be able to mess with the
> Postgres instance in the other jail through IPC.
>
> Perhaps change:
>
> "This improves security because it prevents one jail from
> interfering with shared memory or semaphores in another"
>
> to:
>
> "This improves security because it prevents the postgres user in one
> jail from interfering with shared memory or semaphores owned by a
> different user in another jail (with BSD jails, root, or the same
> UID, in any jail can see and interfere with the shared memory and
> semaphores in any other jail of the same UID, or all if root)"
>
> That's still not great but I think it's a little better...

I updated the wording to say 'non-root users':

If running in FreeBSD jails by enabling <application>sysctl</>'s
<literal>security.jail.sysvipc_allowed</>, <application>postmaster</>s
running in different jails should be run by different operating system
users. This improves security because it prevents non-root users
from interfering with shared memory or semaphores in a different jail,
and it allows the PostgreSQL IPC cleanup code to function properly.
(In FreeBSD 6.0 and later the IPC cleanup code doesn't properly detect
processes in other jails, preventing the running of postmasters on the
same port in different jails.)

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, kris(at)obsecurity(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-11 19:51:34
Message-ID:	20060411195134.GD4474@ns.snowman.net
Lists:	pgsql-hackers

* Bruce Momjian (pgman(at)candle(dot)pha(dot)pa(dot)us) wrote:
> I updated the wording to say 'non-root users':
>
> If running in FreeBSD jails by enabling <application>sysctl</>'s
> <literal>security.jail.sysvipc_allowed</>, <application>postmaster</>s
> running in different jails should be run by different operating system
> users. This improves security because it prevents non-root users
> from interfering with shared memory or semaphores in a different jail,
> and it allows the PostgreSQL IPC cleanup code to function properly.
> (In FreeBSD 6.0 and later the IPC cleanup code doesn't properly detect
> processes in other jails, preventing the running of postmasters on the
> same port in different jails.)

You're still saying it'll do something that it won't... It doesn't
prevent non-root users from messing with each other if they're the same
UID, even if they're under different jails... That's the whole problem
here. :)

Thanks,

Stephen

From:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	"Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Robert Watson <rwatson(at)FreeBSD(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, kris(at)obsecurity(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-04-11 19:56:29
Message-ID:	200604111956.k3BJuTs06846@candle.pha.pa.us
Lists:	pgsql-hackers

Stephen Frost wrote:
> * Bruce Momjian (pgman(at)candle(dot)pha(dot)pa(dot)us) wrote:
> > I updated the wording to say 'non-root users':
> >
> > If running in FreeBSD jails by enabling <application>sysctl</>'s
> > <literal>security.jail.sysvipc_allowed</>, <application>postmaster</>s
> > running in different jails should be run by different operating system
> > users. This improves security because it prevents non-root users
> > from interfering with shared memory or semaphores in a different jail,
> > and it allows the PostgreSQL IPC cleanup code to function properly.
> > (In FreeBSD 6.0 and later the IPC cleanup code doesn't properly detect
> > processes in other jails, preventing the running of postmasters on the
> > same port in different jails.)
>
> You're still saying it'll do something that it won't... It doesn't
> prevent non-root users from messing with each other if they're the same
> UID, even if they're under different jails... That's the whole problem
> here. :)

Uh, the first part says use different Unix users for different jails,
then it says why to do that (security). Seems clear to me.

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Max Khon <fjoe(at)samodelkin(dot)net>
To:	Robert Watson <rwatson(at)FreeBSD(dot)org>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-05-09 11:19:24
Message-ID:	20060509111924.GD64148@samodelkin.net
Lists:	pgsql-hackers

Hi!

On Mon, Apr 03, 2006 at 11:56:13PM +0100, Robert Watson wrote:

> >>This is why it's disabled by default, and the jail documentation
> >>specifically advises of this possibility. Excerpt below.
> >
> >Ah, I see, glad to see it's accurately documented.
>
> As it has been for the last five years, I believe since introduction of the
> setting to allow System V IPC to be used with documented limitations.
>
> >Given the rather significant use of shared memory by Postgres it seems to
> >me that jail'ing it under FBSD is unlikely to get you the kind of
> >isolation between instances that you want (the assumption being that you
> >want to avoid the possibility of a user under one jail impacting a user in
> >another jail). As such, I'd suggest finding something else if you truly
> >need that isolation for Postgres or dropping the jails entirely.
> >
> >Running the Postgres instances under different uids (as you'd probably
> >expect to do anyway if not using the jails) is probably the right
> >approach. Doing that and using jails would probably work, just don't
> >delude yourself into thinking that you're safe from a malicious user in
> >one jail.
>
> Yes, there seems to be an awful lot of noise being made about the fact that
> the system does, in fact, work exactly as documented, and that the
> configuration being complained about is one that is specifically documented
> as being unsupported and undesirable.
>
> As commented elsewhere in this thread, currently, there is no
> virtualization support for System V IPC in the FreeBSD Jail implementation.
> That may change if/when someone implements it. Until it's implemented, it
> isn't going to be there, and the system won't behave as though it's there
> no matter how much jumping up and down is done.

sysvipc has been implemented once, but it has been decided that it adds
unnecessary bloat. That's sad.

/fjoe

From:	Robert Watson <rwatson(at)FreeBSD(dot)org>
To:	Max Khon <fjoe(at)samodelkin(dot)net>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Kris Kennaway <kris(at)obsecurity(dot)org>, freebsd-stable(at)FreeBSD(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: semaphore usage "port based"?
Date:	2006-05-17 10:37:49
Message-ID:	20060517113507.W49041@fledge.watson.org
Lists:	pgsql-hackers

On Tue, 9 May 2006, Max Khon wrote:

>> Yes, there seems to be an awful lot of noise being made about the fact that
>> the system does, in fact, work exactly as documented, and that the
>> configuration being complained about is one that is specifically documented
>> as being unsupported and undesirable.
>>
>> As commented elsewhere in this thread, currently, there is no
>> virtualization support for System V IPC in the FreeBSD Jail implementation.
>> That may change if/when someone implements it. Until it's implemented, it
>> isn't going to be there, and the system won't behave as though it's there
>> no matter how much jumping up and down is done.
>
> sysvipc has been implemented once, but it has been decided that it adds
> unnecessary bloat. That's sad.

I'm not sure I follow the reasoning behind this statement. Could you direct
me to the implementation, and to the specific claim that it adds unnecessary
bloat? As far as I know, no implementation of jail support for System V IPC
has ever been rejected on the basis that it adds bloat -- all discussion about
it has centered on the fact that it is, in fact, a very difficult technical
problem to solve, which requires careful consideration of the approach and
tradeoffs, and that that careful consideration has not yet been done.

Robert N M Watson