Re: SSL renegotiation

Lists: pgsql-committerspgsql-hackers
From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Sean Chittenden <sean(at)chittenden(dot)org>
Subject: SSL renegotiation
Date: 2013-07-10 21:20:17
Message-ID: 20130710212017.GB4941@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hi,

I'm having a look at the SSL support code, because one of our customers
reported it behaves unstably when the network is unreliable. I have yet
to reproduce the exact problem they're having, but while reading the
code I notice this in be-secure.c:secure_write() :

if (ssl_renegotiation_limit && port->count > ssl_renegotiation_limit * 1024L)
{
SSL_set_session_id_context(port->ssl, (void *) &SSL_context,
sizeof(SSL_context));
if (SSL_renegotiate(port->ssl) <= 0)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL renegotiation failure")));
if (SSL_do_handshake(port->ssl) <= 0)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL renegotiation failure")));
if (port->ssl->state != SSL_ST_OK)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL failed to send renegotiation request")));
port->ssl->state |= SSL_ST_ACCEPT;
SSL_do_handshake(port->ssl);
if (port->ssl->state != SSL_ST_OK)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL renegotiation failure")));
port->count = 0;
}

I call your attention to the fact that we're calling SSL_do_handshake
twice here; and what's more, we're messing with the internal
port->ssl->state variable by setting the SSL_ST_ACCEPT bit. According
to the commit message that introduces this[1], this is "text book
correct", but I'm unable to find the text book that specifies that this
is correct. Is it? I have gone through the OpenSSL documentation and
can find not a single bit on this; and I have also gone through the
OpenSSL mailing list archives and as far as I can tell they say you have
to call SSL_renegotiate() and then SSL_do_handshake() once, and that's
it. (You can call SSL_renegotiate_pending() afterwards to verify that
the renegotiation completed successfully, which is something we don't
do.)

I have found in our archives several reports of people that get "SSL
renegotiation failed" error messages in the log, and no explanation has
ever been found. I instrumented that block, and I have observed that
after the second handshake call, ssl->state is 0x2111, 0x21c0, 0x21a0
and other values; never SSL_ST_OK. (0x2000 is SSL_ST_ACCEPT which is
the bit we added; SSL_ST_OK is 0x03). If I remove the second
SSL_do_handshake() call and the changes to port->ssl->state, everything
appears to work perfectly well and there's no extra log spam (and
ssl->state is SSL_ST_OK by the time things are finished). So apparently
renegotiation is not actually failing, but we're just adding a confusing
error message for no apparent reason.

Thoughts?

I have still to find where the actual problems are happening in
unreliable networks ...

[1] commit 17386ac45345fe38a10caec9d6e63afa3ce31bb9
Author: Bruce Momjian <bruce(at)momjian(dot)us>
Date: Wed Jun 11 15:05:50 2003 +0000

patch by Sean Chittenden (in CC), posted as [2]

[2] http://www.postgresql.org/message-id/20030419190821.GQ79923@perrin.int.nxad.com

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-10 22:34:44
Message-ID: 20130710223444.GF4941@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

I think this block is better written as:

if (ssl_renegotiation_limit && port->count > ssl_renegotiation_limit * 1024L)
{
SSL_set_session_id_context(port->ssl, (void *) &SSL_context,
sizeof(SSL_context));
if (SSL_renegotiate(port->ssl) <= 0)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL renegotiation failure in renegotiate")));
else
{
int handshake;

do {
handshake = SSL_do_handshake(port->ssl);
if (handshake <= 0)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL renegotiation failure in handshake, retrying")));
} while (handshake <= 0);

if (port->ssl->state != SSL_ST_OK)
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("SSL failed to send renegotiation request")));
else
port->count = 0;
}
}

In other words, retry the SSL_do_handshake() until it reports OK, but
not more than that; this seems to give better results in my (admittedly
crude) experiments. I am unsure about checking port->ssl->state after
the handshake; it's probably pointless, really.

In any case, the old code was calling SSL_do_handshake() even if
SSL_renegotiate() failed; and it was resetting the port->count even if
the handshake failed. Both of these smell like bugs to me.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [SPAM] SSL renegotiation
Date: 2013-07-10 23:58:07
Message-ID: 51DDF50F.2030402@chittenden.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Wow, that was a long time ago... I remember a few things about this

1) I was running in to an issue where every 64KB of transfer (or
something inanely low like that), SSL was being renegotiated. This was
causing performance problems over the wire. I think we settled on once
an hour renegotiating the key, but I'd have to look again to check. It
looks like the default is now 512MB, which is a more sane limit, but
still pretty easy to exhaust - I have several hosts that would run past
that default every 45sec or so, probably faster at peak times.

2) The system that I was running this on was Solaris 2.5, I believe,
and /dev/random was having a problem staying populated given the
frequent renegotiations, which prompted me to look in to this. In your
testing and attempts to repro, try draining your prng pool, or patch
things on Linux to read from /dev/random instead of /dev/urandom...
something like that may be at fault and why limited testing won't
expose this, but under load you might see it. *shrug* A WAG, but
possibly relevant.

Rough notes inline below (almost all of this should be wrapped in an
<iirc> block):

> if (ssl_renegotiation_limit && port->count > ssl_renegotiation_limit * 1024L)

This doesn't seem right. 512MB * 1024? Maybe that's why I haven't
actually had to play with this limit in a long time. Every 512GB is
much more reasonable in that it would take 12hrs to renegotiate on a
busy host. The "* 1024L" seems suspicious to me and should probably be
removed in favor of the constant passed in from the config.

> {
> SSL_set_session_id_context(port->ssl, (void *) &SSL_context,
> sizeof(SSL_context));
> if (SSL_renegotiate(port->ssl) <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure")));

This sets a bit asking the peer to renegotiation.

> if (SSL_do_handshake(port->ssl) <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure")));
> if (port->ssl->state != SSL_ST_OK)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL failed to send renegotiation request")));

Push the request to renegotiate out to the peer and check the status.

> port->ssl->state |= SSL_ST_ACCEPT;
> SSL_do_handshake(port->ssl);

In OpenSSL 0.9.6 this was the correct way to renegotiate a connection
and I would need to confirm if this is still required for OpenSSL >=
0.9.7.

> if (port->ssl->state != SSL_ST_OK)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure")));
> port->count = 0;
> }
>
> I call your attention to the fact that we're calling SSL_do_handshake
> twice here; and what's more, we're messing with the internal
> port->ssl->state variable by setting the SSL_ST_ACCEPT bit. According
> to the commit message that introduces this[1], this is "text book
> correct", but I'm unable to find the text book that specifies that this
> is correct. Is it? I have gone through the OpenSSL documentation and
> can find not a single bit on this; and I have also gone through the
> OpenSSL mailing list archives and as far as I can tell they say you have
> to call SSL_renegotiate() and then SSL_do_handshake() once, and that's
> it. (You can call SSL_renegotiate_pending() afterwards to verify that
> the renegotiation completed successfully, which is something we don't
> do.)

It would not surprise me if we could #ifdef out the "|= SSL_ST_ACCEPT"
and second call to SSL_do_handshake() if we're running OpenSSL 0.9.7 or
newer. iirc, 0.9.7's big feature was improving renegotiation.

> I have found in our archives several reports of people that get "SSL
> renegotiation failed" error messages in the log, and no explanation has
> ever been found. I instrumented that block, and I have observed that
> after the second handshake call, ssl->state is 0x2111, 0x21c0, 0x21a0
> and other values; never SSL_ST_OK. (0x2000 is SSL_ST_ACCEPT which is
> the bit we added; SSL_ST_OK is 0x03). If I remove the second
> SSL_do_handshake() call and the changes to port->ssl->state, everything
> appears to work perfectly well and there's no extra log spam (and
> ssl->state is SSL_ST_OK by the time things are finished). So apparently
> renegotiation is not actually failing, but we're just adding a confusing
> error message for no apparent reason.
>
> Thoughts?

Since 0.9.6 almost certainly has vulnerabilities that haven't been
fixed, can we just depreciate anything older than OpenSSL 0.9.7 or
#ifdef the above out? 0.9.7 is also ancient, but we're talking about
new releases... and while we're at it, can we prefer PFS ciphers?

> I have still to find where the actual problems are happening in
> unreliable networks ...

I'm guessing you're blocking on /dev/random on some systems and that's
the source of unreliability/timeouts.

> [1] commit 17386ac45345fe38a10caec9d6e63afa3ce31bb9
> Author: Bruce Momjian <bruce(at)momjian(dot)us>
> Date: Wed Jun 11 15:05:50 2003 +0000
>
> patch by Sean Chittenden (in CC), posted as [2]
>
> [2] http://www.postgresql.org/message-id/20030419190821.GQ79923@perrin.int.nxad.com

If you want, I can dig in to this further.... my brain's way back
machine isn't as on-demand as it used to be, but I can certainly help
clean some of this up. -sc

--
Sean Chittenden


From: Troels Nielsen <bn(dot)troels(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-11 01:38:02
Message-ID: CAOdE5WSyb4Y5sCwe1CqBYmLoK7ZAON3R7aBEa8OEX0LVnff_Og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Hi,

These are the relevant bits from Apache2.4's mod_ssl.

SSL_renegotiate(ssl);
SSL_do_handshake(ssl);

if (SSL_get_state(ssl) != SSL_ST_OK) {
ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(02225)
"Re-negotiation request failed");
ssl_log_ssl_error(SSLLOG_MARK, APLOG_ERR, r->server);

r->connection->keepalive = AP_CONN_CLOSE;
return HTTP_FORBIDDEN;
}

ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r, APLOGNO(02226)
"Awaiting re-negotiation handshake");

/* XXX: Should replace setting state with SSL_renegotiate(ssl);
* However, this causes failures in perl-framework currently,
* perhaps pre-test if we have already negotiated?
*/
#ifdef OPENSSL_NO_SSL_INTERN
SSL_set_state(ssl, SSL_ST_ACCEPT);
#else
ssl->state = SSL_ST_ACCEPT;
#endif
SSL_do_handshake(ssl);

sslconn->reneg_state = RENEG_REJECT;

if (SSL_get_state(ssl) != SSL_ST_OK) {
ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(02261)
"Re-negotiation handshake failed: "
"Not accepted by client!?");

r->connection->keepalive = AP_CONN_CLOSE;
return HTTP_FORBIDDEN;
}

That code supports at least OpenSSL 0.9.7 and later.

Some explanation for it can be found here:

http://books.google.dk/books?id=IIqwAy4qEl0C&pg=PT184&lpg=PT184&dq=SSL_do_handshake&source=bl&ots=ma632U4vWv&sig=d2qqS0svhu4EwIZZaONdHoTulVI&hl=en&sa=X&ei=xdPdUczoDuf34QSzmoDQDg&ved=0CIIDEOgBMCo

The snippet there is probably the textbook implementation.

So the original code looks OK. Perhaps the check
on the return code of the first SSL_do_handshake (and SSL_renegotiate)
is unnecessary and may lead to unwarranted error messages, though.
And it may be wrong to continue the renegotiation if
the state is not SSL_ST_OK after the first SSL_do_handshake.

If the renegotiation fails, I suppose two things can be done:
1. Quit the connection
2. Carry on pretending nothing happened.

I think 2. is correct in the vast majority of cases (as it looks like is
being done now).
And in that case: not resetting port->count will make for a very slow
and awkward connection as new handshakes will be attempted again and again,
even if the other party persistently refuse to shake hands.

Kind Regards
Troels Nielsen

On Thu, Jul 11, 2013 at 12:34 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com>wrote:

> I think this block is better written as:
>
> if (ssl_renegotiation_limit && port->count >
> ssl_renegotiation_limit * 1024L)
> {
> SSL_set_session_id_context(port->ssl, (void *) &SSL_context,
> sizeof(SSL_context));
> if (SSL_renegotiate(port->ssl) <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure in
> renegotiate")));
> else
> {
> int handshake;
>
> do {
> handshake = SSL_do_handshake(port->ssl);
> if (handshake <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure in
> handshake, retrying")));
> } while (handshake <= 0);
>
> if (port->ssl->state != SSL_ST_OK)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL failed to send renegotiation
> request")));
> else
> port->count = 0;
> }
> }
>
> In other words, retry the SSL_do_handshake() until it reports OK, but
> not more than that; this seems to give better results in my (admittedly
> crude) experiments. I am unsure about checking port->ssl->state after
> the handshake; it's probably pointless, really.
>
> In any case, the old code was calling SSL_do_handshake() even if
> SSL_renegotiate() failed; and it was resetting the port->count even if
> the handshake failed. Both of these smell like bugs to me.
>
> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Troels Nielsen <bn(dot)troels(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-11 04:13:19
Message-ID: 51DE30DF.3080604@chittenden.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

> If the renegotiation fails

AH! Now I remember. SSL clients can optionally renegotiate, it's not
required to renegotiate the session if the other side chooses not to
(almost certainly due to a bug or limitation in the client's connecting
library). By monkeying with the state, you can explicitly force a client
to renegotiate.

I don't think in code from yesteryear it was portable or possible to see
if the server successfully renegotiated a connection before 0.9.6, so
you just forced the client to renegotiate after the server and ignored
the result. A client pausing for a few extra round trips was probably
never noticed. I'm not saying this is correct, but I think that was the
thinking back in the day.

> , I suppose two things can be done:
>
> 1. Quit the connection

With my Infosec hat on, this is the correct option - force the client
back in to compliance with whatever the stated crypto policy is through
a reconnection.

> 2. Carry on pretending nothing happened.

This is almost never correct in a security context (all errors or
abnormalities must boil up).

> I think 2 is correct in the vast majority of cases (as it looks like
> is being done now).

That is a correct statement in that most code disregards renegotiation,
but that is because there is a pragmatic assumption that HTTPS
connections will be short lived. In the case of PostgreSQL, there is a
good chance that a connection will be established for weeks or months.
In the case of Apache, allowing a client to renegotiate every byte would
be a possible CPU DoS, but I digress....

> And in that case: not resetting port->count will make for a very slow
> and awkward connection as new handshakes will be attempted again and
> again,
> even if the other party persistently refuse to shake hands.

Which could lead to a depletion of random bits. This sounds like a
plausible explanation to me.

Too bad we're stuck using an ill-concieved SSL implementation and can't
use botan[1].

> I think this block is better written as:
>
> if (ssl_renegotiation_limit && port->count >
> ssl_renegotiation_limit * 1024L)

I don't think the " * 1024L" is right.

> {
> SSL_set_session_id_context(port->ssl, (void *)
> &SSL_context,
> sizeof(SSL_context));
> if (SSL_renegotiate(port->ssl) <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure in
> renegotiate")));
> else
> {
> int handshake;
>
> do {
> handshake = SSL_do_handshake(port->ssl);
> if (handshake <= 0)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL renegotiation failure
> in handshake, retrying")));
> } while (handshake <= 0);

It's worth noting that for broken SSL implementations or SSL
implementations that refuse to renegotiate, this will be yield a stalled
connection, though at least it will be obvious in the logs. I think this
is the correct approach.

It is probably prudent to set an upper bound on this loop in order to
free up the resource and unblock the client who will appear to be
mysteriously hung for no reason until they look at the PostgreSQL server
logs.

> if (port->ssl->state != SSL_ST_OK)
> ereport(COMMERROR,
> (errcode(ERRCODE_PROTOCOL_VIOLATION),
> errmsg("SSL failed to send
> renegotiation request")));
> else
> port->count = 0;
> }
> }
>
> In other words, retry the SSL_do_handshake() until it reports OK, but
> not more than that; this seems to give better results in my
> (admittedly
> crude) experiments. I am unsure about checking port->ssl->state after
> the handshake; it's probably pointless, really.

Correct. In async IO, this would be important, but since the server is
synchronous in its handling of communication, we can remove the if/else
(state != SSL_ST_OK) block.

> In any case, the old code was calling SSL_do_handshake() even if
> SSL_renegotiate() failed; and it was resetting the port->count even if
> the handshake failed. Both of these smell like bugs to me.

I don't know how SSL_renegotiate() could fail in the past.
SSL_renegotiate(3) should never fail on a well formed implementation
(e.g. ssl/t1_reneg.c @ssl_add_serverhello_renegotiate_ext()).

While we're on the subject of crypto and OpenSSL, we force server
ciphers to be preferred instead of client ciphers:

SSL_CTX_set_options(SSL_context, SSL_OP_CIPHER_SERVER_PREFERENCE);

http://www.openssl.org/docs/ssl/SSL_CTX_set_options.html#NOTES

SSL_get_secure_renegotiation_support() would be a good call to add to
autoconf to see if it is supported (OpenSSL >= 0.9.8m), which would make
this code a good bit easier. Add a GUC,
ssl_renegotiation_require=(true,false,disconnect), which would force
renegotiations, ignore renegotiation failures, and disconnect a client
if it fails. As mentioned above re: breaking out of the loop, there
should probably be a ssl_renegotiation_min tunable so that the client
can't renegotiate too fast.

The default ssl_ciphers in the examples should also be updated as well
(e.g. we still allow SSLv2):

ssl_ciphers =
'ECDHE-RSA-AES128-SHA256:AES128-GCM-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH:@STRENGTH'

Longer, but current with today's security standards.

-sc

[1] http://botan.randombit.net/tls.html?highlight=renegotiate

--
Sean Chittenden


From: Stuart Bishop <stuart(at)stuartbishop(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-11 06:20:06
Message-ID: CADmi=6M=b8OqvGStWsjum165ySrquOTDSRme=OW98gpOUh8_Tw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Thu, Jul 11, 2013 at 4:20 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:

> I'm having a look at the SSL support code, because one of our customers
> reported it behaves unstably when the network is unreliable. I have yet
> to reproduce the exact problem they're having, but while reading the
> code I notice this in be-secure.c:secure_write() :

The recap of my experiences you requested...

I first saw SSL renegotiation failures on Ubuntu 10.04 LTS (Lucid)
with openssl 0.9.8 (something). I think this was because SSL
renegotiation had been disabled due to due to CVE 2009-3555 (affecting
all versions before 0.9.8l). I think the version now in lucid is
0.9.8k with fixes for SSL renegotiation, but I haven't tested this.

The failures I saw with no-renegotiation-SSL for streaming replication
looked like this:

On the master:

2012-06-25 16:16:26 PDT LOG: SSL renegotiation failure
2012-06-25 16:16:26 PDT LOG: SSL error: unexpected record
2012-06-25 16:16:26 PDT LOG: could not send data to client: Connection
reset by peer

On the hot standby:

2012-06-25 11:12:11 PDT FATAL: could not receive data from WAL stream:
SSL error: sslv3 alert unexpected message
2012-06-25 11:12:11 PDT LOG: record with zero length at 1C5/95D2FE00

Now I'm running Ubuntu 12.04 LTS (Precise) with openssl 1.0.1, and I
think all the known renegotiation issues have been dealt with. I still
get failures, but they are less informative:

<postgres(at)[unknown]:19761> 2013-03-15 03:55:12 UTC LOG: SSL
renegotiation failure

--
Stuart Bishop <stuart(at)stuartbishop(dot)net>
http://www.stuartbishop.net/


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Sean Chittenden <sean(at)chittenden(dot)org>
Cc: Troels Nielsen <bn(dot)troels(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-11 14:11:50
Message-ID: CAGTBQpZjO-rTdKspYZkn-0HqqVy5h6MX+E6CJ+4Rj6xcp6AHng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Thu, Jul 11, 2013 at 1:13 AM, Sean Chittenden <sean(at)chittenden(dot)org> wrote:
>> , I suppose two things can be done:
>>
>> 1. Quit the connection
>
> With my Infosec hat on, this is the correct option - force the client
> back in to compliance with whatever the stated crypto policy is through
> a reconnection.
>
>> 2. Carry on pretending nothing happened.
>
> This is almost never correct in a security context (all errors or
> abnormalities must boil up).
>
>> I think 2 is correct in the vast majority of cases (as it looks like
>> is being done now).
>
> That is a correct statement in that most code disregards renegotiation,
> but that is because there is a pragmatic assumption that HTTPS
> connections will be short lived. In the case of PostgreSQL, there is a
> good chance that a connection will be established for weeks or months.
> In the case of Apache, allowing a client to renegotiate every byte would
> be a possible CPU DoS, but I digress....

And, allowing the client to refuse to renegotiate leaves the relevant
vulnerability unpatched. Renegotiation was introduced to patch a
vulnerability in which, without renegotiation, there was the
possibility of an attacker gaining knowledge of session keys (and
hence the ability to intercept the stream).

I think 2 is not viable in this context. Only 1.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Troels Nielsen <bn(dot)troels(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-12 20:32:52
Message-ID: 20130712203252.GH29206@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Troels Nielsen escribió:
> Hi,
>
> These are the relevant bits from Apache2.4's mod_ssl.
>
> [snip]

So this is basically the same thing the Pg code is doing.

> That code supports at least OpenSSL 0.9.7 and later.
>
> Some explanation for it can be found here:
>
> http://books.google.dk/books?id=IIqwAy4qEl0C&pg=PT184&lpg=PT184&dq=SSL_do_handshake&source=bl&ots=ma632U4vWv&sig=d2qqS0svhu4EwIZZaONdHoTulVI&hl=en&sa=X&ei=xdPdUczoDuf34QSzmoDQDg&ved=0CIIDEOgBMCo
>
> The snippet there is probably the textbook implementation.

Ah, thanks for the pointer. I notice this book is from 2002 and
documents OpenSSL 0.9.6.

So yeah, that's what both mod_ssl and Pg implement. However, reading
the text carefully, I see that this snippet is the correct
implementation for 0.9.6 and earlier; the same book, speaking about the
0.9.7 release in the future tense, explains that in that release it will
be much easier to do renegotiations, without getting into precise
details of how exactly is to be done (I suppose the OpenSSL team hadn't
finalized the API yet). As far as I can understand, what I propose is
the right sequence.

Now, should we support the 0.9.6-and-earlier mechanism? My inclination
is no; even RHEL 3, the oldest supported Linux distribution, uses 0.9.7
(Heck, even Red Hat Linux 9, released on 2003). To see OpenSSL 0.9.6
you need to go back to Red Hat Linux 7.2, released on 2001 using a Linux
kernel 2.4. Surely no one in their right mind would use a current
Postgres release on such an ancient animal.

So I continue to maintain that we should rip the old mechanism out and
replace it with current renegotiation.

Now, I've read a bit more of the code and it seems that we should be
doing SSL_renegotiate() and then check the SSL_renegotiate_pending()
result until it returns that the renegotiation has completed.

> So the original code looks OK. Perhaps the check
> on the return code of the first SSL_do_handshake (and SSL_renegotiate)
> is unnecessary and may lead to unwarranted error messages, though.
> And it may be wrong to continue the renegotiation if
> the state is not SSL_ST_OK after the first SSL_do_handshake.
>
> If the renegotiation fails, I suppose two things can be done:
> 1. Quit the connection
> 2. Carry on pretending nothing happened.
>
> I think 2. is correct in the vast majority of cases (as it looks like is
> being done now).
> And in that case: not resetting port->count will make for a very slow
> and awkward connection as new handshakes will be attempted again and again,
> even if the other party persistently refuse to shake hands.

Good point. From a security point of view, it seems that the correct
reaction is to close the connection if renegotiation doesn't complete.

Along the same lines, it seems to me that accepting SSLv2 (which doesn't
support renegotiations at all) when renegotiations have been requested
is not a good choice; we should accept only SSLv3 in that case.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Troels Nielsen <bn(dot)troels(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-13 00:51:52
Message-ID: 20130713005152.GA1219036@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Fri, Jul 12, 2013 at 04:32:52PM -0400, Alvaro Herrera wrote:
> Now, should we support the 0.9.6-and-earlier mechanism? My inclination
> is no; even RHEL 3, the oldest supported Linux distribution, uses 0.9.7
> (Heck, even Red Hat Linux 9, released on 2003). To see OpenSSL 0.9.6
> you need to go back to Red Hat Linux 7.2, released on 2001 using a Linux
> kernel 2.4. Surely no one in their right mind would use a current
> Postgres release on such an ancient animal.

Agreed. The OpenSSL Project last applied a security fix to 0.9.6 over eight
years ago. Compatibility with 0.9.6 has zero or negative value.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Troels Nielsen <bn(dot)troels(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-16 15:57:19
Message-ID: CA+TgmobqrJ+m7Ke+F+1jbDEsaCLgm0Vm8DDqAw-UfTDtzXve0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Fri, Jul 12, 2013 at 8:51 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Fri, Jul 12, 2013 at 04:32:52PM -0400, Alvaro Herrera wrote:
>> Now, should we support the 0.9.6-and-earlier mechanism? My inclination
>> is no; even RHEL 3, the oldest supported Linux distribution, uses 0.9.7
>> (Heck, even Red Hat Linux 9, released on 2003). To see OpenSSL 0.9.6
>> you need to go back to Red Hat Linux 7.2, released on 2001 using a Linux
>> kernel 2.4. Surely no one in their right mind would use a current
>> Postgres release on such an ancient animal.
>
> Agreed. The OpenSSL Project last applied a security fix to 0.9.6 over eight
> years ago. Compatibility with 0.9.6 has zero or negative value.

+1 from me as well, if any more are needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Fetter <david(at)fetter(dot)org>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Troels Nielsen <bn(dot)troels(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-16 17:41:44
Message-ID: 20130716174144.GA29158@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Fri, Jul 12, 2013 at 08:51:52PM -0400, Noah Misch wrote:
> On Fri, Jul 12, 2013 at 04:32:52PM -0400, Alvaro Herrera wrote:
> > Now, should we support the 0.9.6-and-earlier mechanism? My
> > inclination is no; even RHEL 3, the oldest supported Linux
> > distribution, uses 0.9.7 (Heck, even Red Hat Linux 9, released on
> > 2003). To see OpenSSL 0.9.6 you need to go back to Red Hat Linux
> > 7.2, released on 2001 using a Linux kernel 2.4. Surely no one in
> > their right mind would use a current Postgres release on such an
> > ancient animal.
>
> Agreed. The OpenSSL Project last applied a security fix to 0.9.6
> over eight years ago. Compatibility with 0.9.6 has zero or negative
> value.

You've made a persuasive case that we should actively break backward
compatibility here. Would that be complicated to do?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Noah Misch <noah(at)leadboat(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Troels Nielsen <bn(dot)troels(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <sean(at)chittenden(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-07-16 23:19:49
Message-ID: 20130716231949.GB55849@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Jul 16, 2013 at 10:41:44AM -0700, David Fetter wrote:
> On Fri, Jul 12, 2013 at 08:51:52PM -0400, Noah Misch wrote:
> > Agreed. The OpenSSL Project last applied a security fix to 0.9.6
> > over eight years ago. Compatibility with 0.9.6 has zero or negative
> > value.
>
> You've made a persuasive case that we should actively break backward
> compatibility here. Would that be complicated to do?

Nope. If Alvaro's code change builds under 0.9.6, malfunctioning only at
runtime, I suspect we would add a "configure"-time version check and possibly
a runtime one as well.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-09-20 21:18:05
Message-ID: 20130920211805.GD4832@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Here's the patch I propose to handle renegotiation cleanly. I noticed
in testing that SSL_renegotiate_pending() doesn't seem to actually work
--- if I throw an ereport(FATAL) at the point where I expect the
renegotiation to be complete, it always dies; even if I give it
megabytes of extra traffic while waiting for the renegotiation to
complete. I suspect this is an OpenSSL bug. Instead, in this patch I
check the internal renegotiation counter: grab its current value when
starting the renegotiation, and consider it complete when the counter
has advanced. This works fine.

Another thing not covered by the original code snippet I proposed
upthread is to avoid renegotiating when there was a renegotiation in
progress. This bug has been observed in the field.

Per discussion, I made it close the connection with a FATAL error if the
limit is reached and the renegotiation hasn't taken place. To do
otherwise is not acceptable for a security PoV.

Sean Chittenden mentioned that when retrying the handshake, we should be
careful to only do it a few times, not forever, to avoid a malfeasant
from grabbing hold of a connection indefinitely. I've added that too,
hardcoding the number of retries to 20.

Also, I made this code request a renegotiation slightly before the limit
is actually reached. I noticed that in some cases some traffic can go
by before the renegotiation is actually completed. The difference
should be pretty minimal.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
ssl-renegotiation.patch text/x-diff 5.1 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-09-23 20:51:24
Message-ID: 20130923205121.GM4832@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Here's an updated version; this mainly simplifies code, per comments
from Andres (things were a bit too baroque in places due to the way the
code had evolved, and I hadn't gone over it to simplify it).

The only behavior change is that the renegotiation is requested 1kB
before the limit is hit: the raise to 1% of the configured limit was
removed.

The "fixup" is an incremental on top of the previous version; the other
one is the full v2 patch.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
ssl-renegotiation-1.fixup.patch text/x-diff 4.3 KB
ssl-renegotiation-2.patch text/x-diff 4.6 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-09-24 13:29:01
Message-ID: CA+TgmoZVGmyZLx7e4ARq_5nu4uDeN7wrvg1xJXg_o9c61CHu3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Mon, Sep 23, 2013 at 4:51 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Here's an updated version; this mainly simplifies code, per comments
> from Andres (things were a bit too baroque in places due to the way the
> code had evolved, and I hadn't gone over it to simplify it).
>
> The only behavior change is that the renegotiation is requested 1kB
> before the limit is hit: the raise to 1% of the configured limit was
> removed.

What basis do we have for thinking that 1kB is definitely enough to
avoid spurious disconnects?

(I have a bad feeling that you're going to say something along the
lines of "well, we tried it a bunch of times, and...".)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-09-24 16:30:47
Message-ID: 20130924163047.GO4832@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Robert Haas escribió:
> On Mon, Sep 23, 2013 at 4:51 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Here's an updated version; this mainly simplifies code, per comments
> > from Andres (things were a bit too baroque in places due to the way the
> > code had evolved, and I hadn't gone over it to simplify it).
> >
> > The only behavior change is that the renegotiation is requested 1kB
> > before the limit is hit: the raise to 1% of the configured limit was
> > removed.
>
> What basis do we have for thinking that 1kB is definitely enough to
> avoid spurious disconnects?

I noticed that the "count" variable (which is what we use to determine
when to start the renegotiation and eventually kill the connection) is
only incremented when there's successful SSL transmission: it doesn't
count low-level network transmission. If OpenSSL returns a WANT_READ or
WANT_WRITE error code, that variable is not incremented. The number of
bytes returned does not include network data transmitted only to satisfy
the renegotiation.

Sadly, with the OpenSSL codebase, there isn't much documented field
experience to go by. Even something battle-tested such as Apache's
mod_ssl gets this wrong; but apparently they don't care because their
sessions are normally so short-lived that they don't get these problems.

Also, I spent several days trying to understand the OpenSSL codebase to
figure out how this works, and I think there might be bugs in there too,
at least with nonblocking sockets. I wasn't able to reproduce an actual
failure, though. Funnily enough, their own test utilities do not stress
this area too much (at least the ones they include in their release
tarballs).

> (I have a bad feeling that you're going to say something along the
> lines of "well, we tried it a bunch of times, and...".)

Well, I did try a few times and saw no failure :-)

I have heard about processes in production environments that are
restarted periodically to avoid SSL failures which they blame on
renegotiation. Some other guys have ssl_renegotiation_limit=0 because
they know it causes network problems. I suggest we need to get this
patch out there, so that they can test it; and if 1kB turns out not to
be sufficient, we will have field experience including appropriate error
messages on what is actually going on. (Right now, the error messages
we get are complaining about completely the wrong thing.)

I mean, if that 1kB limit is the only quarrel you have with this patch,
I'm happy.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-09-24 16:46:33
Message-ID: CA+TgmobRc9j8=WqpLKjJz=iKuOCMd9H9CsbROxg1O-yy7Na5iA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Sep 24, 2013 at 12:30 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> I mean, if that 1kB limit is the only quarrel you have with this patch,
> I'm happy.

You should probably be happy, then. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-10-01 13:16:02
Message-ID: 20131001131601.GG5235@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Since back branches releases are getting closer, I would like to push
this to all supported branches. To avoid a compatibility nightmare in
case the new die-on-delayed-renegotiation behavior turns out not to be
so great, I think it would be OK to set the error level to WARNING in
all branches but master (and reset the byte count, to avoid filling the
log). I would also add a CONTEXT line with the current counter value
and the configured limit, and a HINT to report to pg-hackers. That way
we will hopefully have more info on problems in the field.

Anybody opposed to this?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-10-01 14:17:55
Message-ID: CA+Tgmoao24jyZRQA2EJzFd6NJBP5Ot=U+_GArr+_gnxh1LhNRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Oct 1, 2013 at 9:16 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Since back branches releases are getting closer, I would like to push
> this to all supported branches. To avoid a compatibility nightmare in
> case the new die-on-delayed-renegotiation behavior turns out not to be
> so great, I think it would be OK to set the error level to WARNING in
> all branches but master (and reset the byte count, to avoid filling the
> log). I would also add a CONTEXT line with the current counter value
> and the configured limit, and a HINT to report to pg-hackers. That way
> we will hopefully have more info on problems in the field.
>
> Anybody opposed to this?

Yes, warning suck. If things just failed, users would fix them, but
instead they fill up their hard disk, and then things fail much later,
usually when they are asleep in bed.

If we can't feel comfortable with an ERROR, let's not do it at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-10-01 14:19:48
Message-ID: CABUevExUE2GXqtsCDopM9VcR229Pxk=X4r3qNCb2i+JRrk1WLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Oct 1, 2013 at 4:17 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Oct 1, 2013 at 9:16 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> Since back branches releases are getting closer, I would like to push
>> this to all supported branches. To avoid a compatibility nightmare in
>> case the new die-on-delayed-renegotiation behavior turns out not to be
>> so great, I think it would be OK to set the error level to WARNING in
>> all branches but master (and reset the byte count, to avoid filling the
>> log). I would also add a CONTEXT line with the current counter value
>> and the configured limit, and a HINT to report to pg-hackers. That way
>> we will hopefully have more info on problems in the field.
>>
>> Anybody opposed to this?
>
> Yes, warning suck. If things just failed, users would fix them, but
> instead they fill up their hard disk, and then things fail much later,
> usually when they are asleep in bed.
>
> If we can't feel comfortable with an ERROR, let's not do it at all.

In principle, I agree.

However, if we want to do this as a temporary measure to judge impact,
we could do WARNING now and flip it to ERROR in the next minor
release.

However, do we think we'll actually *get* any reports in of it if we
do that? As in would we trust the input? If we do, the it might be
worth doing that. If we don't believe we'll get any input from the
WARNINGs anyway, we might as well flip it to an ERROR. But if we do
flip it to an ERROR, we should have a way to disable that error if, as
Alvaro puts it, we end up breaking too many things.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-10-01 14:27:14
Message-ID: CA+TgmobT0hT4gdJ2=YWeMruMTCT2KZE+1QM+pyHp-wRqCLDEXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Oct 1, 2013 at 10:19 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> If we can't feel comfortable with an ERROR, let's not do it at all.
>
> In principle, I agree.
>
> However, if we want to do this as a temporary measure to judge impact,
> we could do WARNING now and flip it to ERROR in the next minor
> release.

I can't imagine that whacking the behavior around multiple times is
going to please very many people. And, from a practical standpoint,
the time between minor releases is typically on the order of ~3
months. That's not very long. The situations in which trouble occurs
are likely to obscure, and a lot of users don't apply every minor
release, or don't apply them in a timely fashion. So I can't see that
this strategy would increase our confidence very much anyway. In
other words...

> However, do we think we'll actually *get* any reports in of it if we
> do that? As in would we trust the input?

...no.

> If we do, the it might be
> worth doing that. If we don't believe we'll get any input from the
> WARNINGs anyway, we might as well flip it to an ERROR. But if we do
> flip it to an ERROR, we should have a way to disable that error if, as
> Alvaro puts it, we end up breaking too many things.

A way of disabling the error seems like an awfully good idea. Since I
know my audience, I won't suggest the obvious method of accomplishing
that goal, but I think we all know what it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-10-01 20:46:12
Message-ID: 20131001204612.GC5408@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On 2013-10-01 10:27:14 -0400, Robert Haas wrote:
> On Tue, Oct 1, 2013 at 10:19 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> >> If we can't feel comfortable with an ERROR, let's not do it at all.
> >
> > In principle, I agree.
> >
> > However, if we want to do this as a temporary measure to judge impact,
> > we could do WARNING now and flip it to ERROR in the next minor
> > release.
>
> I can't imagine that whacking the behavior around multiple times is
> going to please very many people.

If we have to whack it around, it's because there's a bug in either our
usage of openssl or in openssl itself. Neither is particularly unlikely,
but it's not like not erroring or warning out will stop that.

The alternate reason for getting the WARNING/ERROR is that there is
somebody MITMing the connection and explicitly corrupting renegotiation
packets. But that's a reason for making it an ERROR immediately, not
making it silent. I think from the security POV it's pretty clear that we
should make it an error.

So, imo the question is why are we uncomfortable making it an ERROR
immediately? I think the most likely reason for problems is users having
configured ssl_renegotiation_limit absurdly low, like less than what the
particular algorithms used actually need. What about clamping it to 1MB
if != 0? We can't make it an actual error in the backbranches...

> > If we do, the it might be
> > worth doing that. If we don't believe we'll get any input from the
> > WARNINGs anyway, we might as well flip it to an ERROR. But if we do
> > flip it to an ERROR, we should have a way to disable that error if, as
> > Alvaro puts it, we end up breaking too many things.
>
> A way of disabling the error seems like an awfully good idea. Since I
> know my audience, I won't suggest the obvious method of accomplishing
> that goal, but I think we all know what it is.

In which scenario would you want to do that? The way to prevent the
ERROR it is to disable renegotiation entirely. And that's already
possible. Anything else seems like papering over security issues.

I guess I am voting for making renegotiation failure an ERROR everywhere
and silently clamp renegotiation_limit to a lower bound of 1MB in the
backbranches while making the (0, 1MB) an error in HEAD.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-10-15 11:35:26
Message-ID: 525D287E.60809@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On 09/23/2013 10:51 PM, Alvaro Herrera wrote:
> + /* are we in the middle of a renegotiation? */
> + static bool in_ssl_renegotiation = false;
> +

Since this was committed, I'm getting the following warning:

be-secure.c:105:13: warning: ‘in_ssl_renegotiation’ defined but not used
[-Wunused-variable]

--
Vik


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
Cc: pgsql-committers(at)postgresql(dot)org, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Rework SSL renegotiation code
Date: 2013-10-17 15:42:46
Message-ID: 20131017154245.GF9746@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Vik Fearing wrote:
> On 09/23/2013 10:51 PM, Alvaro Herrera wrote:
> > + /* are we in the middle of a renegotiation? */
> > + static bool in_ssl_renegotiation = false;
>
> Since this was committed, I'm getting the following warning:
>
> be-secure.c:105:13: warning: ‘in_ssl_renegotiation’ defined but not used
> [-Wunused-variable]

Jaime Casanova wrote:

> Shouldn't this new variable be declared inside the #ifdef USE_SSL block?
> My compiler is giving me a warning for the unused variable

Yep, thanks guys. Just pushed a fix.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-14 20:27:33
Message-ID: 20131114202733.GB7583@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

So I committed this patch without backpatching anything. There was some
discussion about the exact strategy for backpatching the behavior
change, but no final agreement; the suggestions were

0. Backpatch as an ERROR. If it causes failures, people is supposed to
change their apps or something.

1. Don't backpatch the ERROR bit at all, so that if the renegotiation
fails we would silently continue just as currently

2. Do spit the message, but only as a WARNING. Thinks this may end up
causing log disks to fill up.

3. Add it as an ERROR, but make it possible to disable it, presumably
via a new GUC. So people can see their security problems and hopefuly
fix them, but if they don't, then they can shut it up via server
configuration. This would entail a GUC variable that exists in existing
branches but not HEAD (we could avoid an upgradability problem of
postgresql.conf files by having a no-op phantom GUC variable in HEAD).

I was reminded of this once more because I just saw a spurious
renegotiation failure in somebody's production setup. Kind of like a
recurring nightmare which I thought I had already erradicated.

Opinions? Also, should we wait longer for the new renegotiation code to
be more battle-tested?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-15 14:33:17
Message-ID: 20131115143317.GZ17272@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Alvaro,

* Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
> 1. Don't backpatch the ERROR bit at all, so that if the renegotiation
> fails we would silently continue just as currently

I'm leaning towards the above at this point.

> I was reminded of this once more because I just saw a spurious
> renegotiation failure in somebody's production setup. Kind of like a
> recurring nightmare which I thought I had already erradicated.

I saw one yesterday. :(

> Opinions? Also, should we wait longer for the new renegotiation code to
> be more battle-tested?

I've got a better environment to test this in now and given that I saw
it just yesterday, I'm very interested in addressing it. I grow tired
of seeing these renegotiation errors.

Thanks!

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-15 15:43:23
Message-ID: 22907.1384530203@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> So I committed this patch without backpatching anything. ...
> ... should we wait longer for the new renegotiation code to
> be more battle-tested?

+1 to waiting awhile. I think if we don't see any problems in
HEAD, then back-patching as-is would be the best solution.
The other alternatives are essentially acknowledging that you're
back-patching something you're afraid isn't production ready.
Let's not go there.

Another reason I'm not in a hurry is that the problem we're trying
to solve doesn't seem to be causing real-world trouble. So by
"awhile", I'm thinking "let's let it get through 9.4 beta testing".

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-15 15:49:50
Message-ID: 20131115154950.GC5489@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
> +1 to waiting awhile. I think if we don't see any problems in
> HEAD, then back-patching as-is would be the best solution.
> The other alternatives are essentially acknowledging that you're
> back-patching something you're afraid isn't production ready.
> Let's not go there.

Agreed. Both on just backpatching it unchanged and waiting for the fix
to prove itself a bit.

> Another reason I'm not in a hurry is that the problem we're trying
> to solve doesn't seem to be causing real-world trouble. So by
> "awhile", I'm thinking "let's let it get through 9.4 beta testing".

Well, there have been a bunch of customer complaints about it, afair
that's what made Alvaro look into it in the first place. So it's not a
victimless bug.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-15 15:58:19
Message-ID: 23255.1384531099@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
>> Another reason I'm not in a hurry is that the problem we're trying
>> to solve doesn't seem to be causing real-world trouble. So by
>> "awhile", I'm thinking "let's let it get through 9.4 beta testing".

> Well, there have been a bunch of customer complaints about it, afair
> that's what made Alvaro look into it in the first place. So it's not a
> victimless bug.

OK, then maybe end-of-beta is too long. But how much testing will it get
during development? I know I never use SSL on development installs.
How many hackers do?

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-15 16:05:29
Message-ID: 20131115160529.GD5489@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On 2013-11-15 10:58:19 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
> >> Another reason I'm not in a hurry is that the problem we're trying
> >> to solve doesn't seem to be causing real-world trouble. So by
> >> "awhile", I'm thinking "let's let it get through 9.4 beta testing".
>
> > Well, there have been a bunch of customer complaints about it, afair
> > that's what made Alvaro look into it in the first place. So it's not a
> > victimless bug.
>
> OK, then maybe end-of-beta is too long. But how much testing will it get
> during development? I know I never use SSL on development installs.
> How many hackers do?

I guess few. And even fewer will actually have connections that live
long enough to experience renegotiations :/.

I wonder how hard it'd be to rig the buildfarm code to generate ssl
certificates and use them during installcheck. If we'd additionally set
a low renegotiation limit...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2013-11-19 13:11:30
Message-ID: CA+TgmoZwU_PZg8NT860yUxo3rCSa=Y-Dn7tOJNxowHsw1XvzsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Fri, Nov 15, 2013 at 10:49 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> Another reason I'm not in a hurry is that the problem we're trying
>> to solve doesn't seem to be causing real-world trouble. So by
>> "awhile", I'm thinking "let's let it get through 9.4 beta testing".
>
> Well, there have been a bunch of customer complaints about it, afair
> that's what made Alvaro look into it in the first place. So it's not a
> victimless bug.

Well, can any of those people try running with this patch? That'd be
a good way of getting some confidence in it.

Generally, I agree that something needs to be back-patched here. But
we don't want to create a situation where we fix some people and break
others, and it's not too obvious that we have a way to get there.
Personally, I favor adding some kind of GUC to control the behavior,
but I'm not exactly sure what the shape of it ought to be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2014-08-26 03:46:13
Message-ID: 20140826034613.GB6343@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
> >> Another reason I'm not in a hurry is that the problem we're trying
> >> to solve doesn't seem to be causing real-world trouble. So by
> >> "awhile", I'm thinking "let's let it get through 9.4 beta testing".
>
> > Well, there have been a bunch of customer complaints about it, afair
> > that's what made Alvaro look into it in the first place. So it's not a
> > victimless bug.
>
> OK, then maybe end-of-beta is too long. But how much testing will it get
> during development? I know I never use SSL on development installs.
> How many hackers do?

Just a reminder that I intend to backpatch this (and subsequent fixes).
We've gone over two 9.4 betas now. Maybe it'd be a good thing if the
beta3 announcement carried a note about enabling SSL with a low
ssl_renegotiation_limit setting.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Noah Misch <noah(at)leadboat(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSL renegotiation
Date: 2014-08-26 04:35:33
Message-ID: 20140826043533.GA711284@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Mon, Aug 25, 2014 at 11:46:13PM -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
> > OK, then maybe end-of-beta is too long. But how much testing will it get
> > during development? I know I never use SSL on development installs.
> > How many hackers do?
>
> Just a reminder that I intend to backpatch this (and subsequent fixes).
> We've gone over two 9.4 betas now. Maybe it'd be a good thing if the
> beta3 announcement carried a note about enabling SSL with a low
> ssl_renegotiation_limit setting.

To elaborate on my private comments of 2013-10-11, I share Robert's
wariness[1] concerning the magic number of 1024 bytes of renegotiation
headroom. Use of that number predates your work, but your work turned
exhaustion of that headroom into a FATAL error. Situations where the figure
is too small will become disruptive, whereas the problem is nearly invisible
today. Network congestion is a factor, so the lack of complaints during beta
is relatively uninformative. Disabling renegotiation is a quick workaround,
fortunately, but needing to use that workaround will damage users' fragile
faith in the safety of our minor releases.

My recommendation is to either keep this 9.4-only or do focused load testing
to determine the actual worst-case headroom requirement.

[1] http://www.postgresql.org/message-id/CA+TgmoZVGmyZLx7e4ARq_5nu4uDeN7wrvg1xJXg_o9c61CHu3g@mail.gmail.com