can't load plpython

Lists: pgsql-hackers
From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: can't load plpython
Date: 2009-03-31 00:34:50
Message-ID: 20090331003450.GN23023@alvh.no-ip.org

Hi,

So I've been trying to get a plpython function that removes accented
letters, based on a Python snippet posted on another thread. The
function is simple enough:

create or replace function unaccent(text) returns text language plpythonu as $$
import unicodedata
s = unicodedata.normalize("NFKD", args[0])
s = ''.join(c for c in s if ord(c) < 127)
return s
$$ ;

However, on HEAD this is crashing for me, and it's right when plpython
loads. Backtrace below.

I already distclean'ed, initdb'd, rebuilt the whole thing from scratch
and I can't make it work. This is on Python 2.5.4, Debian unstable
stuff.

On 8.3 it just fails thusly:
alvherre=# select unaccent('álvaro muñoz');
ERROR: plpython: function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str

Obviously I don't know Python to fix it :-)

#0 dl_open_worker (a=<value optimized out>) at dl-open.c:369
#1 0x00007f6b8bba9436 in _dl_catch_error (objname=0x7fff93db7950, errstring=0x7fff93db7948,
mallocedp=0x7fff93db795f, operate=0x7f6b8bbad780 <dl_open_worker>, args=0x7fff93db7900)
at dl-error.c:178
#2 0x00007f6b8bbad2ab in _dl_open (
file=0x1349980 "/home/alvherre/Code/CVS/pgsql/install/00head/lib/plpython.so",
mode=-2147483390, caller_dlopen=0x78f1ba, nsid=-2, argc=1, argv=0x7fff93db8c08, env=0x127ceb0)
at dl-open.c:596
#3 0x00007f6b8b04ef5b in dlopen_doit (a=<value optimized out>) at dlopen.c:67
#4 0x00007f6b8bba9436 in _dl_catch_error (objname=0x7f6b8b2510d0, errstring=0x7f6b8b2510d8,
mallocedp=0x7f6b8b2510c8, operate=0x7f6b8b04eef0 <dlopen_doit>, args=0x7fff93db7b20)
at dl-error.c:178
#5 0x00007f6b8b04f30c in _dlerror_run (operate=0x7f6b8b04eef0 <dlopen_doit>, args=0x7fff93db7b20)
at dlerror.c:164
#6 0x00007f6b8b04eec1 in __dlopen (file=<value optimized out>, mode=<value optimized out>)
at dlopen.c:88
#7 0x000000000078f1ba in internal_load_library (
libname=0x13762a0 "/home/alvherre/Code/CVS/pgsql/install/00head/lib/plpython.so")
at /pgsql/source/00head/src/backend/utils/fmgr/dfmgr.c:234
#8 0x000000000078ee6a in load_external_function (filename=0x1376268 "$libdir/plpython",
funcname=0x13721a8 "plpython_call_handler", signalNotFound=1 '\001', filehandle=0x7fff93db7d08)
at /pgsql/source/00head/src/backend/utils/fmgr/dfmgr.c:113
#9 0x0000000000790668 in fmgr_info_C_lang (functionId=16393, finfo=0x7fff93db7e60,
procedureTuple=0x7f6b8bd085c0) at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:345
#10 0x00000000007904e1 in fmgr_info_cxt_security (functionId=16393, finfo=0x7fff93db7e60,
mcxt=0x13478b8, ignore_security=0 '\0')
at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:276
#11 0x000000000079022a in fmgr_info_cxt (functionId=16393, finfo=0x7fff93db7e60, mcxt=0x13478b8)
at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:166
#12 0x0000000000790200 in fmgr_info (functionId=16393, finfo=0x7fff93db7e60)
at /pgsql/source/00head/src/backend/utils/fmgr/fmgr.c:156

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 03:04:00
Message-ID: 29215.1238468640@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> ... However, on HEAD this is crashing for me, and it's right when plpython
> loads. Backtrace below.

Does plpython pass its regression tests for you (I'd suppose not)?

For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
but the given example still dumps core. postmaster log says

postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
LOG: server process (PID 4714) was terminated by signal 6: Aborted
LOG: terminating any other active server processes

backtrace

#0 0x0000003e1d032f05 in raise () from /lib64/libc.so.6
#1 0x0000003e1d034a73 in abort () from /lib64/libc.so.6
#2 0x0000003e1d02bef9 in __assert_fail () from /lib64/libc.so.6
#3 0x0000003e3367e67c in PyString_FromString ()
from /usr/lib64/libpython2.5.so.1.0
#4 0x0000003e3366ec26 in PyDict_SetItemString ()
from /usr/lib64/libpython2.5.so.1.0
#5 0x0000000000b2149d in PLy_function_build_args (fcinfo=0x7fffff0adb80,
proc=0x2685c80) at plpython.c:1055
#6 0x0000000000b2281e in PLy_function_handler (fcinfo=0x7fffff0adb80,
proc=0x2685c80) at plpython.c:795
#7 0x0000000000b230f6 in plpython_call_handler (fcinfo=0x7fffff0adb80)
at plpython.c:356
#8 0x000000000056009a in ExecMakeFunctionResult (fcache=0x267c0f0,
econtext=0x267bfc0, isNull=0x267cb38 "", isDone=0x267cbf0)
at execQual.c:1665
#9 0x000000000055af44 in ExecTargetList () at execQual.c:5001
#10 ExecProject (projInfo=<value optimized out>, isDone=0x7fffff0ae06c)
at execQual.c:5202
#11 0x000000000056f289 in ExecResult (node=0x267bea8) at nodeResult.c:155
#12 0x000000000055a27d in ExecProcNode (node=0x267bea8) at execProcnode.c:344
#13 0x0000000000557cca in ExecutePlan () at execMain.c:1504

> Obviously I don't know Python to fix it :-)

Me either. Something is pretty bad in python-land, it seems.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 04:14:22
Message-ID: 20090331041422.GO23023@alvh.no-ip.org

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > ... However, on HEAD this is crashing for me, and it's right when plpython
> > loads. Backtrace below.
>
> Does plpython pass its regression tests for you (I'd suppose not)?

Doh. Silly me. It does pass the regression tests, all six of them. I
guess it's trying to load the unicode stuff that it crashes, not
plpython itself ...

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 04:22:55
Message-ID: 508.1238473375@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Tom Lane wrote:
>> Does plpython pass its regression tests for you (I'd suppose not)?

> Doh. Silly me. It does pass the regression tests, all six of them. I
> guess it's trying to load the unicode stuff that it crashes, not
> plpython itself ...

Hm, maybe we weren't testing quite the same scenario. What locale
and database_encoding were you using? I tried C/SQL_ASCII and
C/UTF8 and got the same result both ways, but obviously that's not
covering much territory.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 13:03:32
Message-ID: 20090331130332.GP23023@alvh.no-ip.org

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > Tom Lane wrote:
> >> Does plpython pass its regression tests for you (I'd suppose not)?
>
> > Doh. Silly me. It does pass the regression tests, all six of them. I
> > guess it's trying to load the unicode stuff that it crashes, not
> > plpython itself ...
>
> Hm, maybe we weren't testing quite the same scenario. What locale
> and database_encoding were you using? I tried C/SQL_ASCII and
> C/UTF8 and got the same result both ways, but obviously that's not
> covering much territory.

I'm on es_CL.UTF-8. I just tried on C/SQL_ASCII and the regression
tests pass there too (and the function still crashes).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 16:55:02
Message-ID: 49D24AE6.4020105@timbira.com

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> ... However, on HEAD this is crashing for me, and it's right when plpython
>> loads. Backtrace below.
>
> Does plpython pass its regression tests for you (I'd suppose not)?
>
> For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
> but the given example still dumps core. postmaster log says
>
> postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
> LOG: server process (PID 4714) was terminated by signal 6: Aborted
> LOG: terminating any other active server processes
>
PyString_FromString() [1] fails to return something useful (i.e., a null
pointer) when its argument is null; as the log above shows, it trips an
assertion instead. The trivial fix (attached) is to ensure that we don't pass
a null pointer as the second argument of PyDict_SetItemString() [2]. Of
course, it's a Python bug and I filed it [3].

>> Obviously I don't know Python to fix it :-)
>
> Me either. Something is pretty bad in python-land, it seems.
>
Me either. ;)

[1]
http://svn.python.org/view/python/trunk/Objects/stringobject.c?revision=70682&view=markup
[2]
http://svn.python.org/view/python/trunk/Objects/dictobject.c?revision=70550&view=markup
[3] http://bugs.python.org/issue5627

--
Euler Taveira de Oliveira
http://www.timbira.com/

Attachment: py.diff (text/plain, 982 bytes)

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 17:14:20
Message-ID: 20090331171420.GY23023@alvh.no-ip.org

Euler Taveira de Oliveira wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> >> ... However, on HEAD this is crashing for me, and it's right when plpython
> >> loads. Backtrace below.
> >
> > Does plpython pass its regression tests for you (I'd suppose not)?
> >
> > For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
> > but the given example still dumps core. postmaster log says
> >
> > postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
> > LOG: server process (PID 4714) was terminated by signal 6: Aborted
> > LOG: terminating any other active server processes
> >
> PyString_FromString() [1] fails to return something useful (i.e., a null
> pointer) when its argument is null; as the log above shows, it trips an
> assertion instead. The trivial fix (attached) is to ensure that we don't pass
> a null pointer as the second argument of PyDict_SetItemString() [2]. Of
> course, it's a Python bug and I filed it [3].

I'm not sure I'm reading this right, but doesn't this prevent a
plpython function from working if its parameters don't have names assigned?
I.e., apparently I can't just use args[0]. I'm sure I'm wrong on this ...?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-03-31 19:02:27
Message-ID: 49D268C3.8060503@timbira.com

Alvaro Herrera wrote:
> Euler Taveira de Oliveira wrote:
>> Tom Lane wrote:
>>> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>>>> ... However, on HEAD this is crashing for me, and it's right when plpython
>>>> loads. Backtrace below.
>>> Does plpython pass its regression tests for you (I'd suppose not)?
>>>
>>> For me on Fedora 10 x86_64, CVS HEAD plus python 2.5.2 passes regression
>>> but the given example still dumps core. postmaster log says
>>>
>>> postgres: tgl regression [local] SELECT: Objects/stringobject.c:107: PyString_FromString: Assertion `str != ((void *)0)' failed.
>>> LOG: server process (PID 4714) was terminated by signal 6: Aborted
>>> LOG: terminating any other active server processes
>>>
>> PyString_FromString() [1] fails to return something useful (i.e., a null
>> pointer) when its argument is null; as the log above shows, it trips an
>> assertion instead. The trivial fix (attached) is to ensure that we don't pass
>> a null pointer as the second argument of PyDict_SetItemString() [2]. Of
>> course, it's a Python bug and I filed it [3].
>
> I'm not sure I'm reading this right, but doesn't this prevent a
> plpython function from working if its parameters don't have names assigned?
No. See the proc->argnames test before PyDict_SetItemString(). The other test
is just tightening the check.

Indeed, the PyDict_*ItemString() functions suffer from the same disease. :( I
reported that upstream too.

Attached is another patch that adds another test before PyDict_DelItemString();
it's safe because if we don't have a key, we don't know what to remove.
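
The two tests described above can be modelled in a few lines of plain Python.
This is only a rough sketch of the guard logic, not the plpython.c code: the
helper names bind_args and unbind_args are hypothetical, a Python dict stands
in for the procedure's globals, and None stands in for a null argument name.

```python
from typing import Any, Dict, Optional, Sequence


def bind_args(globals_dict: Dict[str, Any],
              values: Sequence[Any],
              argnames: Optional[Sequence[Optional[str]]]) -> None:
    """Expose arguments as args[...] and, where a name exists, by name."""
    # Positional access always works, named parameters or not.
    globals_dict["args"] = list(values)
    # Outer test: the function may have no parameter names at all.
    if argnames is None:
        return
    for name, value in zip(argnames, values):
        # Inner test: an individual parameter may still be unnamed,
        # so never use a missing name as a dictionary key.
        if name is not None:
            globals_dict[name] = value


def unbind_args(globals_dict: Dict[str, Any],
                argnames: Optional[Sequence[Optional[str]]]) -> None:
    """Remove the named entries again; without a key there is nothing to remove."""
    if argnames is None:
        return
    for name in argnames:
        if name is not None:
            globals_dict.pop(name, None)


# A function declared as add2(int, int): no names, only args[0] and args[1].
g: Dict[str, Any] = {}
bind_args(g, [1, 2], None)
print(g["args"][0] + g["args"][1])
```

With names present, as in add(a int, b int), the same call also binds a and b,
and unbind_args removes exactly those two entries afterwards.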

Here is my test case (I'm not a python programmer, sorry!).

euler(at)harman $ cat /tmp/{f,g}.sql
create or replace function unaccent(text) returns text language plpythonu as $$
import unicodedata
s = unicodedata.normalize("NFKD", args[0])
s = ''.join(c for c in s if ord(c) < 127)
return s
$$ ;
drop function add(int, int);
drop function add2(int, int);

create or replace function add(a int, b int) returns int language plpythonu as $$
return a + b
$$ ;

create or replace function add2(int, int) returns int language plpythonu as $$
return args[0] + args[1]
$$ ;

euler(at)harman $ ./install/bin/psql
psql (8.4devel)
Type "help" for help.

euler=# select unaccent('até');
NOTICE: PL/Python: args[0]: (null)
ERROR: PL/Python: PL/Python function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str
euler=# select add(1,2);
NOTICE: PL/Python: args[0]: a
NOTICE: PL/Python: args[1]: b
NOTICE: PL/Python: args[0]: a
NOTICE: PL/Python: args[1]: b
 add
-----
   3
(1 row)

euler=# select add2(1,2);
NOTICE: PL/Python: args[0]: (null)
NOTICE: PL/Python: args[1]: (null)
NOTICE: PL/Python: args[0]: (null)
NOTICE: PL/Python: args[1]: (null)
 add2
------
    3
(1 row)

--
Euler Taveira de Oliveira
http://www.timbira.com/

Attachment: py2.diff (text/plain, 1.3 KB)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-04-03 17:05:14
Message-ID: 17642.1238778314@sss.pgh.pa.us

Euler Taveira de Oliveira <euler(at)timbira(dot)com> writes:
> Alvaro Herrera wrote:
>> I'm not sure I'm reading this right, but doesn't this prevent a
>> plpython function from working if its parameters don't have names assigned?

> No. See the proc->argnames test before PyDict_SetItemString(). The other test
> is just tightening the check.

> Indeed, the PyDict_*ItemString() functions suffer from the same disease. :( I
> reported that upstream too.

> Attached is another patch that adds another test before PyDict_DelItemString();
> it's safe because if we don't have a key, we don't know what to remove.

Applied, thanks, along with a regression test case. As far as I can
tell, plpython functions that have no names given for their parameters
have been broken for months, and we did not notice because whoever
added named-parameter support changed *every single* test case to use
only named parameters. Brilliant.

Alvaro's example now gives me this on Fedora 10:

ERROR: PL/Python: PL/Python function "unaccent" failed
DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str

which is the same as it did in 8.3. I do not know if that's a bug
or expected (making the database encoding be utf8 doesn't help).

Alvaro, would you see if it still crashes for you on Debian?
If so there's some other issue with python 2.5.4 ...

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-04-03 18:00:36
Message-ID: 20090403180036.GH23023@alvh.no-ip.org

Tom Lane wrote:

> Alvaro's example now gives me this on Fedora 10:
>
> ERROR: PL/Python: PL/Python function "unaccent" failed
> DETAIL: <type 'exceptions.TypeError'>: normalize() argument 2 must be unicode, not str
>
> which is the same as it did in 8.3. I do not know if that's a bug
> or expected (making the database encoding be utf8 doesn't help).

Apparently the problem is that "str" is a different type in Python than
"unicode". I could get it to work this way:

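create or replace function unaccent(text) returns text language plpythonu as $$
import unicodedata
# The text argument arrives as a byte string ("str"), so look up the server
# encoding and decode it to "unicode" before normalizing.
rv = plpy.execute("select setting from pg_settings where name = 'server_encoding'")
encoding = rv[0]["setting"]
s = args[0].decode(encoding)
# NFKD splits each accented letter into a base letter plus combining marks.
s = unicodedata.normalize("NFKD", s)
# Keep only ASCII characters, which drops the combining marks.
s = ''.join(c for c in s if ord(c) < 127)
return s
$$;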
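
The same decode-then-normalize idea can be tried outside the database. The
following is a minimal standalone sketch, assuming Python 3, where bytes plays
the role that "str" plays in Python 2 and str plays the role of "unicode". The
function name unaccent_bytes and its encoding parameter are hypothetical and
only stand in for the server_encoding lookup above.

```python
import unicodedata


def unaccent_bytes(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes, decompose accented letters, and keep only ASCII."""
    # normalize() only accepts text, so the bytes must be decoded first.
    text = raw.decode(encoding)
    # NFKD turns e.g. U+00E1 into "a" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping everything outside ASCII (0..127) removes the combining marks.
    return "".join(c for c in decomposed if ord(c) < 128)


print(unaccent_bytes("álvaro muñoz".encode("utf-8")))

# Passing undecoded bytes reproduces the kind of failure seen in the thread.
try:
    unicodedata.normalize("NFKD", "até".encode("utf-8"))
except TypeError as exc:
    print("TypeError:", exc)
```

One caveat with the in-database version: not every PostgreSQL encoding name is
a Python codec name (SQL_ASCII, for instance, is not), so decoding with the raw
server_encoding setting may need a mapping for such databases.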

> Alvaro, would you see if it still crashes for you on Debian?
> If so there's some other issue with python 2.5.4 ...

It works for me now. Thanks to Euler for tracking the Python problem
down and to you for the commit!

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-04-03 18:49:19
Message-ID: 19903.1238784559@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Tom Lane wrote:
>> Alvaro, would you see if it still crashes for you on Debian?
>> If so there's some other issue with python 2.5.4 ...

> It works for me now. Thanks to Euler for tracking the Python problem
> down and to you for the commit!

Hmph. I wonder what caused that crash you reported originally? The
backtrace doesn't look like it's explained by the argument-name bug:
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01344.php

Maybe that backtrace is just bogus, though --- if you'd pointed gdb
at the wrong executable version, or something, you could have come
up with silly results. Anyway, if it's no longer reproducible, we
probably shouldn't spend too much time on it.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can't load plpython
Date: 2009-04-03 19:01:29
Message-ID: 20090403190129.GI23023@alvh.no-ip.org

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:

> > It works for me now. Thanks to Euler for tracking the Python problem
> > down and to you for the commit!
>
> Hmph. I wonder what caused that crash you reported originally? The
> backtrace doesn't look like it's explained by the argument-name bug:
> http://archives.postgresql.org/pgsql-hackers/2009-03/msg01344.php
>
> Maybe that backtrace is just bogus, though --- if you'd pointed gdb
> at the wrong executable version, or something, you could have come
> up with silly results.

No, the backtrace is right -- I get the same if I revert the plpython.c
commit. I have no idea why the backtrace looks like this. It's even
compiled with -O0.

> Anyway, if it's no longer reproducible, we probably shouldn't spend
> too much time on it.

Okay.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support