From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <munro(at)ip9(dot)org> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Multithreaded SIGPIPE race in libpq on Solaris |
Date: | 2014-08-28 22:45:31 |
Message-ID: | 12130.1409265931@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Thomas Munro <munro(at)ip9(dot)org> writes:
> My theory is that if two connections accessed by different threads get
> shut down around the same time, there is a race scenario where each of
> them fails to write to its socket, sees errno == EPIPE and then sees a
> pending SIGPIPE with sigpending(), but only one thread returns from
> sigwait() due to signal merging.
Hm, that does sound like it could be a problem, if the platform fails
to track pending SIGPIPE on a per-thread basis.
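
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of the sequence described in the quoted text: block SIGPIPE, attempt the write, and on EPIPE consume the pending signal with sigwait(). It is not libpq's actual code; the helper name `write_no_sigpipe` is hypothetical, and error checking on the signal calls is omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write to fd while keeping SIGPIPE away from the
 * application. Returns what write() returned and preserves its errno. */
static ssize_t write_no_sigpipe(int fd, const void *buf, size_t len)
{
    sigset_t pipe_set, old_mask, pending;
    bool was_pending;
    ssize_t n;
    int saved_errno;

    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);

    /* Block SIGPIPE in the calling thread only, remembering the old mask. */
    pthread_sigmask(SIG_BLOCK, &pipe_set, &old_mask);

    /* A SIGPIPE that was pending before the write is not ours to discard. */
    sigpending(&pending);
    was_pending = (sigismember(&pending, SIGPIPE) == 1);

    n = write(fd, buf, len);
    saved_errno = errno;

    if (n < 0 && saved_errno == EPIPE && !was_pending) {
        sigpending(&pending);
        if (sigismember(&pending, SIGPIPE) == 1) {
            int signo = 0;

            /* The step the thread is about: if the platform merged two
             * threads' SIGPIPEs into one pending signal, only one thread
             * returns from this call and the other may wait indefinitely. */
            sigwait(&pipe_set, &signo);
            assert(signo == SIGPIPE);
        }
    }

    /* Restore the caller's signal mask and the errno left by write(). */
    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    errno = saved_errno;
    return n;
}
```

Calling `write_no_sigpipe()` on the write end of a pipe whose read end has been closed should return -1 with errno set to EPIPE, leaving no SIGPIPE pending afterwards.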
> We never saw the problem again after we made the following change:
> ...
> Does this make any sense?
I don't think that patch would fix the problem if it's real. It would
prevent libpq from hanging up when it's trying to throw away a pending
SIGPIPE, but the fundamental issue is that that action could cause a
SIGPIPE that's meant for some other thread to get lost; and that other
thread isn't necessarily doing anything with libpq.
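
The patch itself is elided above, so the following is not a reconstruction of it. It is only a sketch of one way to discard a pending SIGPIPE without any risk of blocking, using sigtimedwait() with a zero timeout. The helper name `drain_sigpipe_nonblocking` is hypothetical, and the caller is assumed to have SIGPIPE blocked already.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: consume one pending SIGPIPE if there is one,
 * returning immediately either way. Returns true if a signal was consumed.
 * Assumes SIGPIPE is already blocked in the calling thread. */
static bool drain_sigpipe_nonblocking(void)
{
    sigset_t pipe_set;
    struct timespec zero = {0, 0};   /* zero timeout: poll, never wait */
    int rc;

    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);

    /* Retry only if interrupted by an unrelated signal handler. */
    do {
        rc = sigtimedwait(&pipe_set, NULL, &zero);
    } while (rc < 0 && errno == EINTR);

    /* rc is the signal number on success, or -1 with errno == EAGAIN
     * when nothing was pending. */
    return rc == SIGPIPE;
}
```

This removes the hang but not the concern raised above: if the platform keeps pending SIGPIPE per-process rather than per-thread, the signal consumed here could still be one intended for a different thread.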
I don't claim to be an expert on this stuff, but I had the idea that
multithreaded environments were supposed to track signal state per-thread
not just per-process, precisely because of issues like this.
regards, tom lane