Re: Escaping from blocked send() reprised.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: andres(at)2ndquadrant(dot)com
Cc: hlinnakangas(at)vmware(dot)com, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Escaping from blocked send() reprised.
Date: 2014-10-02 08:47:39
Message-ID: 20141002.174739.24593737.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

> > I propose the attached patch. It adds a new flag ImmediateDieOK, which is a
> > weaker form of ImmediateInterruptOK that only allows handling a pending
> > die-signal in the signal handler.
> >
> > Robert, others, do you see a problem with this?
>
> Per se I don't have a problem with it. There does exist the problem that
> the user doesn't get an error message in more cases though. On the other
> hand it's bad if any user can prevent the database from restarting.
>
> > Over IM, Robert pointed out that it's not safe to jump out of a signal
> > handler with siglongjmp, when we're inside library calls, like in a callback
> > called by OpenSSL. But even with current master branch, that's exactly what
> > we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means
> > that any incoming signal will be handled directly in the signal handler,
> > which can mean elog(ERROR). Should we be worried? OpenSSL might get confused
> > if control never returns to the SSL_read() or SSL_write() function that
> > called secure_raw_read().
>
> But this is imo prohibitive. Yes, we're doing it for a long while. But
> no, that's not ok. It actually prompted me into prototyping the latch
> thing (in some other thread). I don't think existing practice justifies
> expanding it further.

I see. In that case, this approach seems basically applicable.
But if I understand correctly, this patch does not seem to return
out of the OpenSSL code even when the latch is found to be set in
secure_raw_write/read. I tried setting errno = ECONNRESET and it
worked, but that seems like a bad hack.

secure_raw_write(Port *port, const void *ptr, size_t len)
{
    n = send(port->sock, ptr, len, 0);

    if (!port->noblock && n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
    {
        w = WaitLatchOrSocket(&MyProc->procLatch, ...

        if (w & WL_LATCH_SET)
        {
            ResetLatch(&MyProc->procLatch);
            /*
             * Force a return, so interrupts can be processed when not
             * (possibly) underneath a ssl library.
             */
            errno = EINTR;
            (return n; // n is negative)

my_sock_write(BIO *h, const char *buf, int size)
{
    res = secure_raw_write(((Port *) h->ptr), buf, size);
    BIO_clear_retry_flags(h);
    if (res <= 0)
    {
        if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
        {
            BIO_set_retry_write(h);
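
A minimal sketch of the control flow in question, written with hypothetical mock_* stand-ins rather than the real PostgreSQL or OpenSSL functions. It assumes that the retry condition raised in the raw write has to travel back through the library entry point to a caller that sits outside the library, and that only this outer caller handles the pending interrupt before retrying. All names, return codes, and the interrupt counter are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; none of this is PostgreSQL or OpenSSL API. */
static bool latch_set = false;         /* stands in for the process latch */
static bool retry_flag = false;        /* stands in for the BIO retry-write flag */
static int  interrupts_processed = 0;  /* counts simulated interrupt handling */

enum { MOCK_OK = 0, MOCK_WANT_WRITE = 1, MOCK_ERROR = 2 };

/* Mock raw write: if the latch is set, reset it and fail with EINTR. */
static long mock_raw_write(size_t len)
{
    if (latch_set)
    {
        latch_set = false;      /* analogous to resetting the latch */
        errno = EINTR;          /* force a return up the call stack */
        return -1;
    }
    return (long) len;          /* pretend everything was sent */
}

/* Mock BIO callback: EINTR/EWOULDBLOCK/EAGAIN are flagged as retryable. */
static long mock_bio_write(size_t len)
{
    long res = mock_raw_write(len);

    retry_flag = false;
    if (res <= 0 &&
        (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN))
        retry_flag = true;
    return res;
}

/* Mock library entry point: reports "want write" instead of looping itself. */
static int mock_lib_write(size_t len, long *written)
{
    long res = mock_bio_write(len);

    if (res > 0)
    {
        *written = res;
        return MOCK_OK;
    }
    return retry_flag ? MOCK_WANT_WRITE : MOCK_ERROR;
}

/* Outer caller: the only place where interrupts are handled. */
static long mock_secure_write(size_t len)
{
    long written = 0;

    assert(len > 0);
    for (;;)
    {
        int rc = mock_lib_write(len, &written);

        if (rc == MOCK_OK)
            return written;
        if (rc == MOCK_ERROR)
            return -1;
        /* MOCK_WANT_WRITE: no library frames are on the stack here. */
        interrupts_processed++;
    }
}
```

If the outer loop retried on MOCK_WANT_WRITE without the interrupt-handling step, the EINTR forced by the raw write would be lost, and a pending die request would not take effect until some later point.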

--
Kyotaro Horiguchi
NTT Open Source Software Center
