Re: Escaping from blocked send() reprised.

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, <robertmhaas(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Escaping from blocked send() reprised.
Date: 2014-08-25 12:29:49
Message-ID: 53FB2C3D.6030303@vmware.com
Lists: pgsql-hackers

On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote:
> At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=Wc8PnQ(at)mail(dot)gmail(dot)com>
>> 1. I think it's the case that there are platforms around where a
>> signal won't cause send() to return EINTR.... and I'd be entirely
>> unsurprised if SSL_write() doesn't necessarily return EINTR in that
>> case. I'm not sure what, if anything, we can do about that.

We use a custom "write" routine with SSL_write, where we call send()
ourselves, so that's not a problem as long as we put the check in the
right place (in secure_raw_write(), after my recent SSL refactoring -
the patch needs to be rebased).

> man 2 send on FreeBSD has no description of EINTR. And even
> on Linux, send() won't return EINTR in most cases; at least I
> haven't seen that. So send() = -1 with EINTR seems to me to be only
> an equivalent of send() = 0. I have no idea what the
> implementer thought the difference was.

As the patch stands, there's a race condition: if the SIGTERM arrives
*before* the send() call, the send() won't return EINTR anyway. So
there's a chance that you still block. Calling pg_terminate_backend()
again will dislodge it (assuming send() returns with EINTR on signal),
but I don't think we want to define the behavior as "usually,
pg_terminate_backend() will kill a backend that's blocked on sending to
the client, but sometimes you have to call it twice (or more!) to really
kill it".

A more robust way is to set ImmediateInterruptOK before calling send().
That wouldn't let you send data that could be sent without blocking,
though. For that, you could put the socket into non-blocking mode, and
sleep with select(), also waiting for the process's latch at the same
time (die() sets the latch, so that will wake up the select() if a
termination request arrives).
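
A rough sketch of the non-blocking approach, outside of PostgreSQL: the
self-pipe below is only a stand-in for the process latch, and
die_handler, init_wakeup_pipe and interruptible_send are hypothetical
names, not backend functions. The socket is assumed to already be in
non-blocking mode. Data that can be sent without blocking is still sent;
the termination request is only honored once send() would block.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the latch: a pipe that the signal handler writes to. */
static int wakeup_pipe[2] = {-1, -1};
static volatile sig_atomic_t die_pending = 0;

static void
die_handler(int signo)
{
    int save_errno = errno;
    ssize_t rc;

    (void) signo;
    die_pending = 1;
    rc = write(wakeup_pipe[1], "x", 1);     /* "set the latch" */
    (void) rc;
    errno = save_errno;
}

static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Must be called before die_handler is installed. */
static int
init_wakeup_pipe(void)
{
    if (pipe(wakeup_pipe) < 0)
        return -1;
    if (set_nonblocking(wakeup_pipe[0]) < 0 ||
        set_nonblocking(wakeup_pipe[1]) < 0)
        return -1;
    return 0;
}

/*
 * Send on a non-blocking socket. Returns the number of bytes sent, or -1
 * with errno set; errno is EINTR if a termination request is pending and
 * the data could not be sent without waiting.
 */
static ssize_t
interruptible_send(int sock, const void *buf, size_t len)
{
    assert(len > 0);

    for (;;)
    {
        fd_set rfds, wfds;
        int maxfd;
        ssize_t n = send(sock, buf, len, 0);

        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;

        /* send() would block: give up if termination was requested. */
        if (die_pending)
        {
            errno = EINTR;
            return -1;
        }

        /*
         * Sleep until the socket is writable or the handler writes to the
         * pipe. A signal arriving just before select() leaves a byte in
         * the pipe, so select() returns immediately and nothing is lost.
         */
        FD_ZERO(&rfds);
        FD_ZERO(&wfds);
        FD_SET(wakeup_pipe[0], &rfds);
        FD_SET(sock, &wfds);
        maxfd = sock > wakeup_pipe[0] ? sock : wakeup_pipe[0];
        if (select(maxfd + 1, &rfds, &wfds, NULL, NULL) < 0 && errno != EINTR)
            return -1;
    }
}
```

The sketch never drains the pipe, which is tolerable only because the
request is terminal; a real latch would be reset after each wake-up.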

Is it actually safe to process the die-interrupt where send() is called?
ProcessInterrupts() does "ereport(FATAL, ...)", which will attempt to
send a message to the client. If that happens in the middle of
constructing some other message, that will violate the protocol.
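
One way to picture the concern, as a sketch only: track whether a
protocol message is partially written, and let the interrupt path choose
between reporting the error and closing the connection silently. OutBuf,
DieAction and the functions below are hypothetical and much simpler than
the real message format (no length word, fixed-size buffer); this is not
how the backend is actually structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output buffer that knows whether a message is half-built. */
typedef struct OutBuf
{
    char   data[256];
    size_t len;
    bool   msg_in_progress;
} OutBuf;

typedef enum
{
    REPORT_FATAL_TO_CLIENT,     /* safe to append an error message */
    CLOSE_SILENTLY              /* appending would corrupt the stream */
} DieAction;

static void
begin_message(OutBuf *out, char type)
{
    assert(!out->msg_in_progress && out->len < sizeof out->data);
    out->data[out->len++] = type;
    out->msg_in_progress = true;
}

static void
append_bytes(OutBuf *out, const char *p, size_t n)
{
    assert(out->msg_in_progress && out->len + n <= sizeof out->data);
    memcpy(out->data + out->len, p, n);
    out->len += n;
}

static void
end_message(OutBuf *out)
{
    out->msg_in_progress = false;
}

/* What the die-interrupt path could do at this point in the stream. */
static DieAction
action_on_die(const OutBuf *out)
{
    return out->msg_in_progress ? CLOSE_SILENTLY : REPORT_FATAL_TO_CLIENT;
}
```

Between begin_message() and end_message(), an error message from
ereport(FATAL) would land inside another message's payload, so the only
protocol-safe choice there is to drop the connection without sending.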

>> 2. I think it would be reasonable to try to kill off the connection
>> without notifying the client if we're unable to send the data to the
>> client in a reasonable period of time. But I'm unsure what "a
>> reasonable period of time" means. This patch would basically do it
>> after no delay at all, which seems like it might be too aggressive.
>> However, I'm not sure.
>
> I think there's no such reasonable time.

I agree it's pretty hard to define any reasonable timeout here. I think
it would be fine to just cut the connection; even if you don't block
while sending, you'll probably reach a CHECK_FOR_INTERRUPTS() somewhere
higher in the stack and kill the connection almost as abruptly anyway.
(You can't violate the protocol, however.)

- Heikki
