Re: Latch implementation that wakes on postmaster death on both win32 and Unix

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Latch implementation that wakes on postmaster death on both win32 and Unix
Date: 2011-07-05 02:59:43
Message-ID: CAHGQGwFaY=zqPRvHiTN=vk4oHCjKKM+EyBEo_QsU1ZGDJYxzmQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 5, 2011 at 1:36 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Jul4, 2011, at 17:53 , Heikki Linnakangas wrote:
>>>       Under Linux, select() may report a socket file descriptor as "ready for
>>>       reading",  while nevertheless a subsequent read blocks.  This could for
>>>       example happen when data has arrived but  upon  examination  has  wrong
>>>       checksum and is discarded.  There may be other circumstances in which a
>>>       file descriptor is spuriously reported as ready.  Thus it may be  safer
>>>       to use O_NONBLOCK on sockets that should not block.
>>
>> So in theory, on Linux WaitLatch might sometimes incorrectly return WL_POSTMASTER_DEATH. None of the callers check for the WL_POSTMASTER_DEATH return code; they call PostmasterIsAlive() before assuming the postmaster has died, so that won't affect correctness at the moment. I doubt that scenario can even happen in our case, select() on a pipe that is never written to. But maybe we should add an assertion to WaitLatch to assert that if select() reports that the postmaster pipe has been closed, PostmasterIsAlive() also returns false.
>
> The correct solution would be to read() from the pipe after select()
> returns, and only return WL_POSTMASTER_DEATH if the read doesn't return
> EAGAIN. To prevent that read() from blocking if the read event was indeed
> spurious, O_NONBLOCK must be set on the pipe, but that patch does that already.

+1

The syslogger does something similar: it calls read() on the pipe after
select() returns, and if the return value of read() is exactly zero, it
concludes that EOF has arrived and that no write-side process remains.
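
To make the idea concrete, here is a minimal sketch of that check. It is
not taken from the patch or from the syslogger; pipe_writer_is_gone() is a
hypothetical helper name, and it assumes the read end of the pipe already
has O_NONBLOCK set and that nobody ever writes to the pipe. It polls with a
zero timeout only to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): report whether the write end
 * of a pipe has been closed.  Assumes read_fd is the read end of a pipe
 * with O_NONBLOCK set, and that the pipe is never written to.
 */
static bool
pipe_writer_is_gone(int read_fd)
{
    fd_set          rfds;
    struct timeval  tv = {0, 0};    /* poll once, do not wait */
    char            c;
    ssize_t         rc;

    assert(read_fd >= 0 && read_fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(read_fd, &rfds);

    /* Nothing reported readable (or select() failed): writer still there. */
    if (select(read_fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return false;

    /* Do not trust select() alone; confirm with a non-blocking read(). */
    rc = read(read_fd, &c, 1);
    if (rc == 0)
        return true;                /* EOF: no write-side process remains */

    /*
     * rc < 0 with EAGAIN/EWOULDBLOCK means the readiness report was
     * spurious.  Any other outcome (another error, or unexpected data) is
     * treated conservatively here as "writer not known to be gone".
     */
    return false;
}
```

With a fresh pipe whose read end is non-blocking, the helper returns false
while the write end is open and true once it has been closed.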

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
