Re: pgcrypto related backend crash on solaris 10/x86_64

Lists: pgsql-hackers
From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-09 15:14:38
Message-ID: 46E40DDE.1010004@kaltenbrunner.cc

I brought back clownfish (still a bit dubious about the unexplained
failures, which seem to be VMware emulation bugs, but this one seems to
be easily reproducible) onto the buildfarm and enabled --with-openssl
after the recent openssl/pgcrypto related fixes, but I'm still
getting a backend crash during the pgcrypto regression tests:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50

backtrace looks like:

program terminated by signal SEGV (no mapping at the fault address)
0xfffffd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
(dbx) where
=>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49,
0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfffffd7fff241b61
[2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0

Stefan


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-09 15:52:37
Message-ID: e51f66da0709090852i64fe389dj9c4a6e2de2d8bf0d@mail.gmail.com

On 9/9/07, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> wrote:
> I brought back clownfish (still a bit dubious about the unexplained
> failures, which seem to be VMware emulation bugs, but this one seems to
> be easily reproducible) onto the buildfarm and enabled --with-openssl
> after the recent openssl/pgcrypto related fixes, but I'm still
> getting a backend crash during the pgcrypto regression tests:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50
>
> backtrace looks like:
>
> program terminated by signal SEGV (no mapping at the fault address)
> 0xfffffd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
> (dbx) where
> =>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49,
> 0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfffffd7fff241b61
> [2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0

This is crashing because of the crippled OpenSSL on some versions
of Solaris. Zdenek Kotala posted a workaround for that; I am
cleaning it up but have not found the time to finalize it.

I'll try to post v03 of Zdenek's patch ASAP.

--
marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Marko Kreen" <markokr(at)gmail(dot)com>
Cc: "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-09 16:36:33
Message-ID: 25367.1189355793@sss.pgh.pa.us

"Marko Kreen" <markokr(at)gmail(dot)com> writes:
> On 9/9/07, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> wrote:
>> I brought back clownfish (still a bit dubious about the unexplained
>> failures, which seem to be VMware emulation bugs, but this one seems to
>> be easily reproducible) onto the buildfarm and enabled --with-openssl
>> after the recent openssl/pgcrypto related fixes, but I'm still
>> getting a backend crash during the pgcrypto regression tests:

> This is crashing because of the crippled OpenSSL on some versions
> of Solaris. Zdenek Kotala posted a workaround for that; I am
> cleaning it up but have not found the time to finalize it.

But clownfish was working fine up through Aug 2, and the only change in
pgcrypto since then could hardly have introduced this failure:
http://archives.postgresql.org/pgsql-committers/2007-08/msg00306.php

So I think there's more to it than Marko's explanation. Maybe clownfish
now has a different OpenSSL version installed than before?

regards, tom lane


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-09 16:59:21
Message-ID: 46E42669.9070800@kaltenbrunner.cc

Tom Lane wrote:
> "Marko Kreen" <markokr(at)gmail(dot)com> writes:
>> On 9/9/07, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> wrote:
>>> I brought back clownfish (still a bit dubious about the unexplained
>>> failures, which seem to be VMware emulation bugs, but this one seems to
>>> be easily reproducible) onto the buildfarm and enabled --with-openssl
>>> after the recent openssl/pgcrypto related fixes, but I'm still
>>> getting a backend crash during the pgcrypto regression tests:
>
>> This is crashing because of the crippled OpenSSL on some versions
>> of Solaris. Zdenek Kotala posted a workaround for that; I am
>> cleaning it up but have not found the time to finalize it.
>
> But clownfish was working fine up through Aug 2, and the only change in
> pgcrypto since then could hardly have introduced this failure:
> http://archives.postgresql.org/pgsql-committers/2007-08/msg00306.php
>
> So I think there's more to it than Marko's explanation. Maybe clownfish
> now has a different OpenSSL version installed than before?

No, clownfish was not building with OpenSSL before because of that
"crippled openssl" issue. I was under the assumption that the above
commit incorporated the complete fix from Zdenek, so I re-enabled
--with-openssl, only to find that it is still not working ...

Stefan


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-11 11:29:17
Message-ID: 46E67C0D.3090700@sun.com

Marko Kreen wrote:
> On 9/9/07, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> wrote:
>> I brought back clownfish (still a bit dubious about the unexplained
>> failures, which seem to be VMware emulation bugs, but this one seems to
>> be easily reproducible) onto the buildfarm and enabled --with-openssl
>> after the recent openssl/pgcrypto related fixes, but I'm still
>> getting a backend crash during the pgcrypto regression tests:
>>
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-09-09%2012:14:50
>>
>> backtrace looks like:
>>
>> program terminated by signal SEGV (no mapping at the fault address)
>> 0xfffffd7fff241b61: AES_encrypt+0x0241: xorq (%r15,%rdx,8),%rbx
>> (dbx) where
>> =>[1] AES_encrypt(0x5, 0x39dc9a7a, 0xf560e7b50e, 0x90ca350d49,
>> 0xf560e7b50ea90dfb, 0x6b6b6b6b), at 0xfffffd7fff241b61
>> [2] 0x0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x0
>
> This is crashing because of the crippled OpenSSL on some versions
> of Solaris. Zdenek Kotala posted a workaround for that; I am
> cleaning it up but have not found the time to finalize it.
>
> I'll try to post v03 of Zdenek's patch ASAP.
>

However, I guess there will still be a problem with the regression
tests: pgcrypto will report an error when the user tries to use a
stronger cipher, and that produces a diff between the expected and
actual output.

I don't know whether it is possible to select a different expected
output depending on whether strong crypto is installed. Maybe some
magic in Makefile/configure. The test should be:

# ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
# libcrypto_extra.so.0.9.8 => (file not found)

If the output contains "(file not found)", the library is not installed
or is not in the path (/usr/sfw/lib).
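
As a minimal sketch of how that check could be scripted (for example
from a Makefile or configure step), the helper below reads ldd output on
standard input and succeeds only when a libcrypto_extra line is marked
"(file not found)". The function name strong_crypto_missing is
hypothetical, and the pgcrypto.so path in the usage comment is simply
the one from the command above; neither is part of any existing
PostgreSQL build script.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): reads `ldd` output on
# stdin and exits 0 when a libcrypto_extra line is reported as
# "(file not found)", i.e. the strong-crypto library is missing or is
# not in the library path. Exits non-zero otherwise.
strong_crypto_missing() {
    grep libcrypto_extra | grep 'file not found' >/dev/null
}

# Usage (path is an assumption taken from the example above):
#   if ldd /usr/postgres/8.2/lib/pgcrypto.so | strong_crypto_missing; then
#       echo "libcrypto_extra not installed or not in path (/usr/sfw/lib)"
#   fi
```

Reading from standard input keeps the check independent of where
pgcrypto.so is installed, so the same function can be fed the ldd
output of whichever build is being tested.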

Zdenek


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>
Cc: "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-11 14:28:27
Message-ID: e51f66da0709110728i7a5f60d6q27827bec44af32f5@mail.gmail.com

On 9/11/07, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
> Marko Kreen wrote:
> > This is crashing because of the crippled OpenSSL on some versions
> > of Solaris. Zdenek Kotala posted a workaround for that; I am
> > cleaning it up but have not found the time to finalize it.
> >
> > I'll try to post v03 of Zdenek's patch ASAP.

> However, I guess there will still be a problem with the regression
> tests: pgcrypto will report an error when the user tries to use a
> stronger cipher, and that produces a diff between the expected and
> actual output.
>
> I don't know whether it is possible to select a different expected
> output depending on whether strong crypto is installed. Maybe some
> magic in Makefile/configure. The test should be:
>
> # ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
> # libcrypto_extra.so.0.9.8 => (file not found)
>
> If the output contains "(file not found)", the library is not installed
> or is not in the path (/usr/sfw/lib).

Failing regression tests are fine - it is good if the user can
easily see that the OS is broken.

--
marko


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-11 15:13:15
Message-ID: 46E6B08B.1030703@sun.com

Marko Kreen wrote:
> On 9/11/07, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>> Marko Kreen wrote:
>>> This is crashing because of the crippled OpenSSL on some versions
>>> of Solaris. Zdenek Kotala posted a workaround for that; I am
>>> cleaning it up but have not found the time to finalize it.
>>>
>>> I'll try to post v03 of Zdenek's patch ASAP.
>
>> However, I guess there will still be a problem with the regression
>> tests: pgcrypto will report an error when the user tries to use a
>> stronger cipher, and that produces a diff between the expected and
>> actual output.
>>
>> I don't know whether it is possible to select a different expected
>> output depending on whether strong crypto is installed. Maybe some
>> magic in Makefile/configure. The test should be:
>>
>> # ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
>> # libcrypto_extra.so.0.9.8 => (file not found)
>>
>> If the output contains "(file not found)", the library is not installed
>> or is not in the path (/usr/sfw/lib).
>
> Failing regression tests are fine - it is good if the user can
> easily see that the OS is broken.
>

But if the build machines keep complaining about this problem, we can
easily overlook other problems. There are two possible solutions:
1) modify the regression test, or 2) recommend installing the crypto
package on all affected build machines.

Anyway, I plan to add a note to the Solaris FAQ once we have the final
patch. I also think it would be good to mention this in the pgcrypto
README, or to add a comment to the regression test's expected output
file, which will be visible in regression.diffs.

Zdenek


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgcrypto related backend crash on solaris 10/x86_64
Date: 2007-09-12 07:54:04
Message-ID: 46E79B1C.8040002@kaltenbrunner.cc

Zdenek Kotala wrote:
> Marko Kreen wrote:
>> On 9/11/07, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>>> Marko Kreen wrote:
>>>> This is crashing because of the crippled OpenSSL on some versions
>>>> of Solaris. Zdenek Kotala posted a workaround for that; I am
>>>> cleaning it up but have not found the time to finalize it.
>>>>
>>>> I'll try to post v03 of Zdenek's patch ASAP.
>>
>>> However, I guess there will still be a problem with the regression
>>> tests: pgcrypto will report an error when the user tries to use a
>>> stronger cipher, and that produces a diff between the expected and
>>> actual output.
>>>
>>> I don't know whether it is possible to select a different expected
>>> output depending on whether strong crypto is installed. Maybe some
>>> magic in Makefile/configure. The test should be:
>>>
>>> # ldd /usr/postgres/8.2/lib/pgcrypto.so | grep libcrypto_extra
>>> # libcrypto_extra.so.0.9.8 => (file not found)
>>>
>>> If the output contains "(file not found)", the library is not installed
>>> or is not in the path (/usr/sfw/lib).
>>
>> Failing regression tests are fine - it is good if the user can
>> easily see that the OS is broken.
>>
>
> But if the build machines keep complaining about this problem, we can
> easily overlook other problems. There are two possible solutions:
> 1) modify the regression test, or 2) recommend installing the crypto
> package on all affected build machines.
>
> Anyway, I plan to add a note to the Solaris FAQ once we have the final
> patch. I also think it would be good to mention this in the pgcrypto
> README, or to add a comment to the regression test's expected output
> file, which will be visible in regression.diffs.

Well, in my opinion we should simply fail the regression tests (not
crash like we do now) when we have to deal with such a crippled OpenSSL
installation. Adding information about that issue to the Solaris FAQ
also seems like a good thing.

Stefan