Re: Strange hanging bug in a simple milter

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Vesa-Matti J Kari <vmkari(at)cc(dot)helsinki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Date: 2013-09-13 18:35:29
Message-ID: 20130913183529.GI1330627@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-09-13 14:33:25 -0400, Stephen Frost wrote:
> * Stephen Frost (sfrost(at)snowman(dot)net) wrote:
> > * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> > > Hm. close_SSL() first does pqsecure_destroy() which will unset the
> > > callbacks, and the count and then goes on to do X509_free() and
> > > ENGINE_finish(), ENGINE_free() if either is used.
> > >
> > > It's not implausible that one of those actually needs locking. I doubt
> > > engines play a role here, but, without having looked at the testcase,
> > > X509_free() might be a possibility.
> >
> > Unfortunately, while I can still easily get the deadlock to happen when
> > the hooks are reset, the hooks don't appear to ever get called when
> > ssl_open_connections is set to zero. You have a good point about the
> > additional SSL calls after the hooks are unloaded though, I wonder if
> > holding the ssl_config_mutex lock over all of close_SSL might be more
> > sensible..
>
> I went ahead and moved the locks to be around all of close_SSL() and
> haven't been able to reproduce the deadlock, so perhaps those calls are
> the issue and what's happening is that another thread is dropping or
> adding the hooks in a common place while the X509_free, etc, are trying
> to figure out if they should be calling the locking functions or not,
> but there's a race because there's no higher-level locking happening
> around those.
>
> Attached is a patch to move those and which doesn't deadlock for me.

It seems slightly cleaner to just move the pqsecure_destroy() call to the
end of that function, made conditional on a boolean. But if you think
otherwise, I won't protest...
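
To make the suggested ordering concrete, here is a minimal sketch. It does
not use libpq or OpenSSL at all: SketchConn, sketch_init(), sketch_destroy(),
sketch_stub_free() and sketch_close_ssl() are hypothetical stand-ins for
PGconn, the hook setup, pqsecure_destroy(), the SSL_free()/X509_free()/
ENGINE_finish() calls and close_SSL(). It only illustrates deferring the
destroy step behind a boolean; whether that by itself closes the race is not
something this sketch can show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PGconn: only the handles named in the thread. */
typedef struct SketchConn
{
	void	   *ssl;		/* stands in for the SSL handle */
	void	   *peer;		/* stands in for the peer X509 certificate */
	void	   *engine;		/* stands in for the ENGINE handle */
} SketchConn;

static int	sketch_open_connections = 0;	/* models ssl_open_connections */
static bool sketch_hooks_installed = false; /* models the locking hooks */
static int	sketch_frees_without_hooks = 0; /* frees run after hooks were unset */
static char sketch_trace[16];				/* records the call order */

/* Append one step name to the trace buffer, never overflowing it. */
static void
sketch_trace_step(const char *step)
{
	strncat(sketch_trace, step,
			sizeof(sketch_trace) - strlen(sketch_trace) - 1);
}

/* Models opening a connection: bump the count and install the hooks. */
static void
sketch_init(void)
{
	sketch_open_connections++;
	sketch_hooks_installed = true;
}

/* Models pqsecure_destroy(): drop the count, unset hooks on the last one. */
static void
sketch_destroy(void)
{
	assert(sketch_open_connections > 0);
	if (--sketch_open_connections == 0)
		sketch_hooks_installed = false;
	sketch_trace_step("D");
}

/* Models a free call that may want the locking hooks to be present. */
static void
sketch_stub_free(const char *step)
{
	if (!sketch_hooks_installed)
		sketch_frees_without_hooks++;
	sketch_trace_step(step);
}

/* Models close_SSL() with the destroy step moved to the end. */
static void
sketch_close_ssl(SketchConn *conn)
{
	bool		destroy_needed = false;

	if (conn->ssl)
	{
		sketch_stub_free("S");
		conn->ssl = NULL;
		destroy_needed = true;	/* remember it, but do not destroy yet */
	}
	if (conn->peer)
	{
		sketch_stub_free("X");
		conn->peer = NULL;
	}
	if (conn->engine)
	{
		sketch_stub_free("E");
		conn->engine = NULL;
	}
	if (destroy_needed)
		sketch_destroy();		/* hooks go away only after every free */
}
```

A caller would run sketch_init() once per connection and then
sketch_close_ssl() on it; in this model every free runs while the hooks are
still installed, and the destroy step is always the last one.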

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
