Re: libpq does not manage SSL callbacks properly when other libraries are involved.

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: PoolSnoopy <tlatzelsberger(at)gmx(dot)at>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: libpq does not manage SSL callbacks properly when other libraries are involved.
Date: 2008-09-01 11:48:30
Message-ID: 48BBD68E.3010705@pws.com.au
Lists: pgsql-bugs pgsql-hackers

Alvaro Herrera wrote:
> PoolSnoopy wrote:
>
>> ***PUSH***
>>
>> this bug is really some annoyance if you use automatic build environments.
>> I'm using phpunit to run tests and as soon as postgres is involved the php
>> cli environment segfaults at the end. this can be worked around by disabling
>> ssl but it would be great if the underlying bug got fixed.
>>
>
> This is PHP's bug, isn't it? Why are you complaining here?
No, this is a problem with the callback/exit functions used by
PostgreSQL. We set up callback functions when we use SSL; if somebody
else uses SSL, we can create a problem.

I thought my original report was detailed enough to explain where the
problem is coming from. Excerpt from the original report:

This is part of a comment from the PHP bug comment history:

*[12 Nov 2007 2:45pm UTC] sam at zoy dot org*

Hello, I did read the sources and studied them, and I can confirm
that it is a matter of callback jumping to an invalid address.

libpq's init_ssl_system() installs callbacks by calling
CRYPTO_set_id_callback() and CRYPTO_set_locking_callback(). This
function is called each time initialize_SSL() is called (for instance
through the PHP pg_connect() function) and does not keep a reference
counter, so libpq's destroy_SSL() has no way to know that it should
call a destroy_ssl_system() function, and there is no such function
anyway. So the callbacks are never removed.

But then, upon cleanup, PHP calls zend_shutdown() which properly
unloads pgsql.so and therefore the unused libpq.

Finally, the zend_shutdown procedure calls zm_shutdown_curl()
which in turn calls curl_global_cleanup() which leads to an
ERR_free_strings() call and eventually a CRYPTO_lock() call.
CRYPTO_lock() checks whether there are any callbacks to call,
finds one (the one installed by libpq), calls it, and crashes
because libpq was unloaded and hence the callback is no longer
in mapped memory.

--

Basically, PostgreSQL doesn't remove its own callbacks when the pg
connection is shut down. So if the libpq library is unloaded before
other libraries that use SSL, you get a crash as described above. PHP
has suggested that the fix is to keep a reference counter in libpq so
it knows when to remove the callbacks.

This is a complicated bug, but without real evidence there is no way to
go back to PHP and say it's their fault. Their analysis is
relatively comprehensive compared to the feedback that's been posted
here so far. I'm not sure how best to set up an environment to replicate
the bug in a way I can debug it. And even if I get to the point of
nailing it down, I'll just be back asking questions about how you would
fix it, because I know very little about SSL.

All that said, a quick poke in the source of PostgreSQL says that
fe-secure.c sets callbacks using CRYPTO_set_xx_callback(...). It appears
these are only set in the threaded version, which is pretty much the
default in all the installations I encounter.

My Google research indicates we need to call
CRYPTO_set_xx_callback(NULL) when we exit, but that's not done. One
idea for a fix is to add a counter that initialize_SSL() increments and
destroy_SSL() decrements. If it reaches 0, call
CRYPTO_set_xx_callback(NULL) to remove the callbacks. This
openssl-users thread, about a Windows SSL crash in iexplore, describes
the same problem:
http://www.mail-archive.com/openssl-users(at)openssl(dot)org/msg53869.html
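
To make the counter idea concrete, here is a rough sketch. It is not
libpq code: set_locking_callback() and set_id_callback() are hypothetical
stand-ins for CRYPTO_set_locking_callback() and CRYPTO_set_id_callback(),
so the logic can be followed without linking OpenSSL, and ssl_acquire(),
ssl_release(), ssl_users and the demo_* callbacks are names made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for OpenSSL's process-wide callback slots.
 * A real build would call CRYPTO_set_locking_callback() and
 * CRYPTO_set_id_callback() instead of these two setters.
 */
typedef void (*locking_cb)(int mode, int n, const char *file, int line);
typedef unsigned long (*id_cb)(void);

static locking_cb installed_locking_cb = NULL;
static id_cb installed_id_cb = NULL;

static void set_locking_callback(locking_cb cb) { installed_locking_cb = cb; }
static void set_id_callback(id_cb cb) { installed_id_cb = cb; }

/* Placeholder callbacks standing in for the ones the library registers. */
static void demo_locking_cb(int mode, int n, const char *file, int line)
{
    (void) mode; (void) n; (void) file; (void) line;
}

static unsigned long demo_id_cb(void)
{
    return 1UL;                 /* placeholder thread id */
}

/*
 * Number of live users of the SSL setup (assumed counter).
 * Not thread-safe as written; a real version would need to guard
 * this counter with a mutex.
 */
static int ssl_users = 0;

/* Would run where initialize_SSL() runs: install on first use only. */
static void ssl_acquire(void)
{
    if (ssl_users++ == 0)
    {
        set_id_callback(demo_id_cb);
        set_locking_callback(demo_locking_cb);
    }
}

/* Would run where destroy_SSL() runs: remove on the last release. */
static void ssl_release(void)
{
    assert(ssl_users > 0);      /* release without a matching acquire */
    if (--ssl_users == 0)
    {
        set_locking_callback(NULL);
        set_id_callback(NULL);
    }
}
```

With this shape, two connections opened and closed in any order leave
the callbacks installed until the second close, and nothing points into
the library once the last connection is gone.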

Thoughts?

Regards

Russell Smith
