Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]

Lists: pgsql-hackers
From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]
Date: 2007-12-12 02:39:26
Message-ID: 20071212023926.GC8302@alvh.no-ip.org

Hi,

Here's another problem report on Windows. This time it is usage of SSL
connections and NOTIFY. I talked to Magnus on IRC and he directed me to
bug #2829:
http://archives.postgresql.org/pgsql-bugs/2006-12/msg00122.php

This report seems to be a little different, if only because the reported
error string from SSL mentions an "Unknown winsock error 10004".

This guy is using 8.2.5. SSL seems to be able to fill his log files at
full speed.

Is this an issue we can do something about?

----- Forwarded message from Henry <hensa22(at)yahoo(dot)es> -----

From: Henry <hensa22(at)yahoo(dot)es>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Postgres <pgsql-es-ayuda(at)postgresql(dot)org>
Date: Wed, 12 Dec 2007 03:34:04 +0100 (CET)
Subject: Re: [pgsql-es-ayuda] SLL error 100% cpu
Message-ID: <744138(dot)71684(dot)qm(at)web30802(dot)mail(dot)mud(dot)yahoo(dot)com>

--- Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:

> Henry wrote:
> > Hello to everyone on the list.
> >
> > I have put SSL with PostgreSQL into production, and performance
> > degrades the longer it is in use: the processes take 100% of the
> > CPU, and when I stop the service some postgres.exe processes are
> > left hanging. I disabled log writing because too many log files
> > were being created containing the text SYSCALL ERROR...............
> > Strangely, one file even grew to 14MB (odd, since it is configured
> > for a maximum of 10MB).

---------------------------------
> Can you send an excerpt of that giant file? Just a few lines
> of that SYSCALL ERROR.
----------------------------------

Here it is:
LOG: SSL SYSCALL error: Unknown winsock error 10004

Regards



----- End forwarded message -----

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
<Schwern> It does it in a really, really complicated way
<crab> why does it need to be complicated?
<Schwern> Because it's MakeMaker.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]
Date: 2007-12-12 03:32:07
Message-ID: 11120.1197430327@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> This guy is using 8.2.5. SSL seems to be able to fill his log files at
> full speed.

Are you *sure* the server is 8.2.5? 8.2.5 shouldn't emit duplicate
messages, but 8.2.4 and before would:

2007-05-17 21:20 tgl

* src/backend/libpq/: be-secure.c (REL7_4_STABLE), be-secure.c
(REL8_1_STABLE), be-secure.c (REL8_0_STABLE), be-secure.c
(REL8_2_STABLE), be-secure.c: Remove redundant logging of send
failures when SSL is in use. While pqcomm.c had been taught not to
do that ages ago, the SSL code was helpfully bleating anyway.
Resolves some recent reports such as bug #3266; however the
underlying cause of the related bug #2829 is still unclear.

Furthermore, it looks to me like "SSL SYSCALL error: %m" doesn't
exist anymore since that patch, so my bogometer is buzzing loudly.

I dunno anything about how to fix the real problem (what's winsock error
10004?), but I don't think he'd be seeing full speed log filling in
8.2.5.

regards, tom lane


From: "Trevor Talbot" <quension(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]
Date: 2007-12-12 05:13:37
Message-ID: 90bce5730712112113hdcad1d3x44d59f5f116ed6b5@mail.gmail.com

On 12/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:

> I dunno anything about how to fix the real problem (what's winsock error
> 10004?), but I don't think he'd be seeing full speed log filling in
> 8.2.5.

WSAEINTR, "A blocking operation was interrupted by a call to
WSACancelBlockingCall."

Offhand I'd take it as either not entirely sane usage of a network
API, or one of the so very many broken software firewalls / network
security products.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Trevor Talbot" <quension(at)gmail(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]
Date: 2007-12-12 05:30:50
Message-ID: 12357.1197437450@sss.pgh.pa.us

"Trevor Talbot" <quension(at)gmail(dot)com> writes:
> On 12/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I dunno anything about how to fix the real problem (what's winsock error
>> 10004?),

> WSAEINTR, "A blocking operation was interrupted by a call to
> WSACancelBlockingCall."

Oh, then it's exactly the same thing as our bug #2829.

I opined in that thread that OpenSSL was broken because it failed to
treat this as a retryable case like EINTR. But not being much of a
Windows person, that might be mere hot air. Someone with a Windows
build environment should try patching OpenSSL to treat WSAEINTR
the same as Unix EINTR and see what happens ...

regards, tom lane
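
The suggestion above, treating WSAEINTR as retryable in the same way as Unix EINTR, could be sketched roughly as follows. This is not OpenSSL or PostgreSQL code, and the thread does not establish that a retry is the right fix. The names `write_fn`, `is_interrupted`, `write_retrying`, and `flaky_write` are hypothetical, and the value 10004 is hard-coded as `SKETCH_WSAEINTR` (instead of including `<winsock2.h>`) so the example compiles on any platform.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Winsock's WSAEINTR is 10004; hard-coded so the sketch builds off Windows. */
#define SKETCH_WSAEINTR 10004

/* Hypothetical I/O hook: returns bytes written, or -1 and sets *err. */
typedef long (*write_fn)(void *ctx, const void *buf, size_t len, int *err);

/* An interrupted call is treated as "try again" on either platform. */
static int is_interrupted(int err)
{
    return err == EINTR || err == SKETCH_WSAEINTR;
}

/*
 * Call write_once until it succeeds or fails with a non-interrupt error.
 * max_retries bounds the loop so a persistent interrupt cannot spin forever.
 */
static long write_retrying(write_fn write_once, void *ctx,
                           const void *buf, size_t len,
                           int max_retries, int *err)
{
    int attempt;

    assert(write_once != NULL && err != NULL);
    for (attempt = 0; ; attempt++)
    {
        long n = write_once(ctx, buf, len, err);

        if (n >= 0 || !is_interrupted(*err) || attempt >= max_retries)
            return n;
    }
}

/* Stand-in for a socket write: fails with WSAEINTR *ctx times, then succeeds. */
static long flaky_write(void *ctx, const void *buf, size_t len, int *err)
{
    int *failures_left = ctx;

    (void) buf;
    if (*failures_left > 0)
    {
        (*failures_left)--;
        *err = SKETCH_WSAEINTR;
        return -1;
    }
    return (long) len;
}
```

The retry bound is a deliberate choice in this sketch: an unbounded loop on a call that keeps being interrupted would reproduce the 100% CPU symptom from the report instead of the log flood.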


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Trevor Talbot <quension(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]
Date: 2007-12-12 09:55:49
Message-ID: 20071212095549.GF11269@svr2.hagander.net

On Wed, Dec 12, 2007 at 12:30:50AM -0500, Tom Lane wrote:
> "Trevor Talbot" <quension(at)gmail(dot)com> writes:
> > On 12/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I dunno anything about how to fix the real problem (what's winsock error
> >> 10004?),
>
> > WSAEINTR, "A blocking operation was interrupted by a call to
> > WSACancelBlockingCall."
>
> Oh, then it's exactly the same thing as our bug #2829.
>
> I opined in that thread that OpenSSL was broken because it failed to
> treat this as a retryable case like EINTR. But not being much of a
> Windows person, that might be mere hot air. Someone with a Windows
> build environment should try patching OpenSSL to treat WSAEINTR
> the same as Unix EINTR and see what happens ...

When I last looked at this (and this was some time ago), I suspected (and
still do) that the problem is in the interaction between our
socket-emulation-stuff (for signals) and openssl. I'm not entirely sure,
but I wanted to rewrite the SSL code so that *our* code is responsible for
calling the actual send()/recv(), and not OpenSSL. This would also fix the
fact that if an OpenSSL network operation ends up blocking, that process
can't receive any signals...

I didn't have time to get this done before feature-freeze though, and I
believe the changes are large enough to qualify as such.

//Magnus
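
The idea described above, where the server's own code performs the actual send()/recv() and the SSL layer only hands it bytes, can be illustrated with a library-free sketch. With OpenSSL this kind of indirection is normally done through a custom BIO. The code below deliberately avoids the library and only shows the inversion of control. Every name in it (`io_hooks`, `layer_send`, `app_socket`, `app_write`) is hypothetical, the XOR step is a placeholder and not encryption, and the `memcpy` stands in for a real socket call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callbacks supplied by the application; the protocol layer never calls
 * send()/recv() itself. */
typedef struct io_hooks
{
    long (*write)(void *ctx, const unsigned char *buf, size_t len);
    void *ctx;
} io_hooks;

/* Placeholder for the protocol layer (NOT real encryption): transforms the
 * payload, then hands the bytes to the application's write hook. */
static long layer_send(const io_hooks *io, const unsigned char *data, size_t len)
{
    unsigned char out[256];
    size_t i;

    assert(io != NULL && io->write != NULL);
    if (len > sizeof(out))
        return -1;
    for (i = 0; i < len; i++)
        out[i] = data[i] ^ 0x5A;
    return io->write(io->ctx, out, len);
}

/* Application side: because it owns the write, it can check for pending
 * signals around the actual socket call. */
typedef struct app_socket
{
    unsigned char sent[256];
    size_t        sent_len;
    int           signal_checks;
} app_socket;

static long app_write(void *ctx, const unsigned char *buf, size_t len)
{
    app_socket *sock = ctx;

    sock->signal_checks++;          /* stand-in for a signal-dispatch check */
    if (len > sizeof(sock->sent) - sock->sent_len)
        return -1;
    memcpy(sock->sent + sock->sent_len, buf, len);   /* stand-in for send() */
    sock->sent_len += len;
    return (long) len;
}
```

In this arrangement the blocking point lives in `app_write`, which is code the application controls, so signal handling and retry policy can be decided there and not inside the library.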


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Trevor Talbot <quension(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [hensa22@yahoo.es: Re: [pgsql-es-ayuda] SLL error 100% cpu]
Date: 2008-03-21 19:34:22
Message-ID: 200803211934.m2LJYMT04196@momjian.us


Added to TODO:

o Prevent SSL from sending network packets to avoid interference
with Win32 signal emulation

http://archives.postgresql.org/pgsql-hackers/2007-12/msg00455.php

---------------------------------------------------------------------------

Magnus Hagander wrote:
> On Wed, Dec 12, 2007 at 12:30:50AM -0500, Tom Lane wrote:
> > "Trevor Talbot" <quension(at)gmail(dot)com> writes:
> > > On 12/11/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >> I dunno anything about how to fix the real problem (what's winsock error
> > >> 10004?),
> >
> > > WSAEINTR, "A blocking operation was interrupted by a call to
> > > WSACancelBlockingCall."
> >
> > Oh, then it's exactly the same thing as our bug #2829.
> >
> > I opined in that thread that OpenSSL was broken because it failed to
> > treat this as a retryable case like EINTR. But not being much of a
> > Windows person, that might be mere hot air. Someone with a Windows
> > build environment should try patching OpenSSL to treat WSAEINTR
> > the same as Unix EINTR and see what happens ...
>
> When I last looked at this (and this was some time ago), I suspected (and
> still do) that the problem is in the interaction between our
> socket-emulation-stuff (for signals) and openssl. I'm not entirely sure,
> but I wanted to rewrite the SSL code so that *our* code is responsible for
> calling the actual send()/recv(), and not OpenSSL. This would also fix the
> fact that if an OpenSSL network operation ends up blocking, that process
> can't receive any signals...
>
> I didn't have time to get this done before feature-freeze though, and I
> believe the changes are large enough to qualify as such.
>
> //Magnus

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +