Re: Some 9.5beta2 backend processes not terminating properly?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Shay Rojansky <roji(at)roji(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some 9.5beta2 backend processes not terminating properly?
Date: 2016-01-02 13:10:38
Message-ID: CAA4eK1LFEwohKjuDAfAdxqztq4rTk4-PXp0q73rHUmU0xDrkaQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 2, 2016 at 5:02 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:

> On 2016-01-02 12:05, Amit Kapila wrote:
>>
>> I am also able to reproduce it now. The reason I could not before was
>> that I didn't have the latest .NET Framework and Visual Studio, which
>> are required for the recent version of Npgsql.
>>
>> One probable cause of the problem is that on Windows we now emulate
>> non-blocking behaviour by setting pgwin32_noblock = true, which makes
>> pgwin32_recv() return EWOULDBLOCK, so we then wait using
>> WaitLatchOrSocket() instead of pgwin32_waitforsinglesocket().
>> The two APIs (WaitLatchOrSocket() and pgwin32_waitforsinglesocket())
>> differ in how they wait, and that may be the reason for this
>> behaviour. One thing I have tried: if I don't set pgwin32_noblock
>> in secure_raw_read(), the problem does not occur, which led to the
>> above reasoning. I am still investigating.
>>
>>
> Well, without pgwin32_noblock = true we never enter the code block which
> calls WaitLatchOrSocket and hangs, as in my testing this was always
> called because of EWOULDBLOCK.
>
>
What I meant is that WaitLatchOrSocket() and
pgwin32_waitforsinglesocket() do not handle socket closure the same
way, which is how this problem can arise, and that seems to be the
right direction to pursue. I have found that in WaitLatchOrSocket(),
even when the socket has been closed, we record the result as
WL_SOCKET_READABLE and then try to wait again, whereas
pgwin32_waitforsinglesocket() handles this case properly. If we
remember the socket-closed event and take appropriate action, the
problem does not happen. The attached patch is by no means a complete
fix, but it shows what I mean; with it applied, the problem reported
by Shay no longer occurs, although I get a LOG message because proper
handling of socket closure still needs to be done in this path. The
patch is based on the code after commit
387da18874afa17156ee3af63766f17efb53c4b9. I will test further and
refine the fix against HEAD later, as I am done for today.
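
To make the idea concrete, here is a toy model of the difference
described above. It is only a sketch and is not the attached patch or
the actual PostgreSQL code: the names EV_READ, EV_CLOSE, RES_READABLE,
RES_CLOSED, translate_events(), read_loop() and the track_close flag
are all invented for illustration, and the model does not try to
reproduce real Winsock event semantics. It assumes the peer has
already closed the connection and only shows why folding "closed" into
"readable" leaves the caller waiting, while remembering the close as
its own event lets the caller stop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event bits, standing in for socket read/close notifications. */
enum { EV_READ = 1 << 0, EV_CLOSE = 1 << 1 };

/* Hypothetical wait results; RES_CLOSED is invented for this sketch. */
enum { RES_READABLE = 1 << 0, RES_CLOSED = 1 << 1 };

/*
 * Translate raw socket events into a wait result.
 * With track_close == false, a close event is reported as "readable",
 * which is the behaviour described in the mail above.
 * With track_close == true, the close is remembered as a separate result.
 */
static int
translate_events(int events, bool track_close)
{
    int result = 0;

    if (events & EV_READ)
        result |= RES_READABLE;
    if (events & EV_CLOSE)
        result |= track_close ? RES_CLOSED : RES_READABLE;
    return result;
}

/*
 * Simulated read loop on a socket the peer has already closed.
 * Every wait reports only EV_CLOSE; without a distinct "closed" result
 * the caller sees "readable", gets nothing, and waits again.
 * Returns the number of waits performed, capped at max_waits.
 */
static int
read_loop(bool track_close, int max_waits)
{
    int waits = 0;

    assert(max_waits > 0);
    while (waits < max_waits)
    {
        int result = translate_events(EV_CLOSE, track_close);

        waits++;
        if (result & RES_CLOSED)
            break;              /* caller can treat this as end of stream */
        /* otherwise: reported readable, nothing to read, wait again */
    }
    return waits;
}
```

In this model read_loop(true, 100) stops after the first wait, while
read_loop(false, 100) uses up all 100 waits without ever noticing the
closure. The cap exists only so that the sketch terminates.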

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: win_socket_wait_issue_v1.patch (application/octet-stream, 1.5 KB)
