Re: PATCH: Keep one postmaster monitoring pipe per process

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marco Pfatschbacher <Marco_Pfatschbacher(at)genua(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: Keep one postmaster monitoring pipe per process
Date: 2016-09-15 20:24:07
Message-ID: 13213.1473971047@sss.pgh.pa.us
Lists: pgsql-hackers

Marco Pfatschbacher <Marco_Pfatschbacher(at)genua(dot)de> writes:
> the current implementation of PostmasterIsAlive() uses a pipe to
> monitor the existence of the postmaster process.
> One end of the pipe is held open in the postmaster, while the other end is
> inherited by all the auxiliary and background processes when they fork.
> This leads to multiple processes calling select(2), poll(2) and read(2)
> on the same end of the pipe.
> While this is technically perfectly ok, it has the unfortunate side
> effect that it triggers an inefficient behaviour[0] in the select/poll
> implementation on some operating systems[1]:
> The kernel can only keep track of one pid per select address and
> thus has no other choice than to wakeup(9) every process that
> is waiting on select/poll.

That seems like a rather bad kernel bug.
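
For readers following along, the pipe scheme described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual PostmasterIsAlive() code: the name parent_is_alive() is hypothetical, and the sketch assumes nothing is ever written to the pipe, so any readiness on the read end means every holder of the write end has gone away.

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL's real function).
 *
 * read_fd is the read end of a pipe whose write end is held open only by
 * the parent.  Every forked child inherits this same read end, which is
 * why many processes end up polling one pipe.
 *
 * Returns true while a writer still exists, false once the write end has
 * been closed everywhere (i.e. the parent has exited).
 */
bool
parent_is_alive(int read_fd)
{
    struct pollfd pfd;
    int           rc;

    pfd.fd = read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do
    {
        rc = poll(&pfd, 1, 0);  /* zero timeout: check without blocking */
    } while (rc < 0 && errno == EINTR);

    /*
     * rc == 0 means no data and no hangup, so the writer is still there.
     * Anything else (hangup, EOF readable, or an error) is treated as
     * "parent gone", since nothing is ever written to this pipe.
     */
    return rc == 0;
}
```

A child would call pipe() in the parent before fork(), keep only the read end, and call parent_is_alive() periodically or add the descriptor to its existing poll set.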

> In our case the system had to wake up ~3000 idle ssh processes
> every time PostgreSQL called PostmasterIsAlive.

Uh, these are processes not even connected to the pipe in question?
That's *really* a kernel bug.

> Attached patch avoids the select contention by using a
> separate pipe for each auxiliary and background process.

I think this would likely move the performance problems somewhere else.
In particular, it would mean that every postmaster child would inherit
pipes leading to all the older children. We could close 'em again
I guess, but we have previously found that having to do things that
way is a rather serious performance drag --- see the problems we had
with POSIX named semaphores, here for instance:
https://www.postgresql.org/message-id/flat/3830CBEB-F8CE-4EBC-BE16-A415E78A4CBC%40apple.com
I really don't want the postmaster to be holding any per-child kernel
resources.

It'd certainly be nice if we could find another solution besides
the pipe-based one, but I don't think "more pipes" is the answer.
There was some discussion of using Linux's prctl(PR_SET_PDEATHSIG)
when available; do the BSDen have anything like that?
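
As a rough illustration of the prctl(PR_SET_PDEATHSIG) idea, a child could ask the kernel to signal it when its parent exits, so that no shared pipe is needed. This is a Linux-only sketch, not a proposed patch: the helper name request_parent_death_signal() and its arguments are hypothetical, and the choice of signal is left to the caller.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, meant to be called in a child right after fork().
 *
 * expected_parent is the parent's PID, recorded before fork().
 * signo is the signal the kernel should deliver when the parent exits.
 *
 * Returns 0 on success, -1 if the request failed or the parent had
 * already exited before the request took effect.
 */
int
request_parent_death_signal(pid_t expected_parent, int signo)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long) signo) != 0)
        return -1;

    /*
     * Close the race: if the parent died between fork() and prctl(),
     * no signal will ever arrive, but getppid() no longer matches.
     */
    if (getppid() != expected_parent)
        return -1;

    return 0;
}
```

The setting applies per process and is cleared in the child of a fork(), so each child would have to make the request itself.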

regards, tom lane
