pl/python refactoring

Lists: pgsql-hackers
From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pl/python refactoring
Date: 2010-12-23 13:41:04
Message-ID: 4D135170.3080705@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here's the base refactoring patch mentioned in
http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php

Git branch for this patch: https://github.com/wulczer/postgres/tree/refactor

It does some architectural changes to PL/Python that make it easier to
implement other features, like for instance a validator function. The
full list of changes in the patch is:

* Use HTABs keyed with OIDs for procedure and trigger caches. Using
Python dictionaries for these caches seemes like using the wrong tool,
and then you get better error reporting and memory management with PG HTABs.

* Get rid of global error state. It was wrong, 'nuff said.

* Fix an error when the iterator was left half-open after an error.

* Factor out PLy_procedure_{input,output}_conversion functions.

* Use palloc in TopMemoryContext instead of malloc. As discussed in
http://archives.postgresql.org/pgsql-hackers/2010-11/msg01857.php

* Use PyObject_New instead of PyObject_NEW (which is undocumented).

* Reset exception state after catching an error from SPI calls. Doing
SPI calls in subtransactions will remove the need to do that altogether,
so this little wart will go away soon.

* Correctly put the exceptions in the plpy module. Previous coding
failed for Python 3 (the exceptions were not available on Python 3 from
the plpy module, see the new test in plpython_test.sql and also
http://archives.postgresql.org/message-id/4D0CFE5C.5060704@wulczer.org).

* Define all fields of the PyModuleDef structure for Python 3 (see
http://docs.python.org/py3k/c-api/module.html#PyModuleDef). Not sure why
this worked at all, I guess by some lucky coincidence the undefined
fields were falling in an accessible memory region filled with zeroes...

* Raise normal exceptions from plpy.error() instead of relying on the
global error state.

* Do not prefix error messages with "PL/Python: ". It's redundant given
the error context.

All other PL/Python patches rely on these refactorings.

Cheers,
Jan

Attachment Content-Type Size
plpython-refactor.diff text/x-patch 61.3 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-01 00:00:03
Message-ID: 1293840003.5984.1.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tor, 2010-12-23 at 14:41 +0100, Jan Urbański wrote:
> It does some architectural changes to PL/Python that make it easier to
> implement other features, like for instance a validator function. The
> full list of changes in the patch is:

I would review this and the following patches, but I'd really prefer it
if you could split this particular patch into about 11 single-purpose
patches. I think most of the changes here are not interrelated.


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-01 12:24:01
Message-ID: 4D1F1CE1.1010005@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/01/11 01:00, Peter Eisentraut wrote:
> On tor, 2010-12-23 at 14:41 +0100, Jan Urbański wrote:
>> It does some architectural changes to PL/Python that make it easier to
>> implement other features, like for instance a validator function. The
>> full list of changes in the patch is:
>
> I would review this and the following patches, but I'd really prefer it
> if you could split this particular patch into about 11 single-purpose
> patches. I think most of the changes here are not interrelated.

OK, I'll split this patch into even smaller chunks.

Jan


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-01 12:50:28
Message-ID: 1293886228.5984.2.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On lör, 2011-01-01 at 13:24 +0100, Jan Urbański wrote:
> On 01/01/11 01:00, Peter Eisentraut wrote:
> > On tor, 2010-12-23 at 14:41 +0100, Jan Urbański wrote:
> >> It does some architectural changes to PL/Python that make it easier to
> >> implement other features, like for instance a validator function. The
> >> full list of changes in the patch is:
> >
> > I would review this and the following patches, but I'd really prefer it
> > if you could split this particular patch into about 11 single-purpose
> > patches. I think most of the changes here are not interrelated.
>
> OK, I'll split this patch into even smaller chunks.

Thanks. Just attach them all to a single mail message. Don't create
new CF entries or something.


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-02 11:41:24
Message-ID: 4D206464.5090805@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/01/11 13:50, Peter Eisentraut wrote:
> On lör, 2011-01-01 at 13:24 +0100, Jan Urbański wrote:
>> On 01/01/11 01:00, Peter Eisentraut wrote:
>>> On tor, 2010-12-23 at 14:41 +0100, Jan Urbański wrote:
>>>> It does some architectural changes to PL/Python that make it easier to
>>>> implement other features, like for instance a validator function. The
>>>> full list of changes in the patch is:
>>>
>>> I would review this and the following patches, but I'd really prefer it
>>> if you could split this particular patch into about 11 single-purpose
>>> patches. I think most of the changes here are not interrelated.
>>
>> OK, I'll split this patch into even smaller chunks.
>
> Thanks. Just attach them all to a single mail message. Don't create
> new CF entries or something.

Here they are. There are 16 patches in total. They amount to what's
currently in my refactor branch (and almost to what I've sent as the
complete refactor patch, while doing the splitting I discovered a few
useless hunks that I've discarded).

Cheers,
Jan

Attachment Content-Type Size
0001-Use-HTABs-instead-of-Python-dictionary-objects-to-ca.patch text/x-patch 0 bytes

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-17 19:49:45
Message-ID: 1295293785.12898.7.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On sön, 2011-01-02 at 12:41 +0100, Jan Urbański wrote:
> Here they are. There are 16 patches in total. They amount to what's
> currently in my refactor branch (and almost to what I've sent as the
> complete refactor patch, while doing the splitting I discovered a few
> useless hunks that I've discarded).

Committed 0001, rest later.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-18 22:22:50
Message-ID: 1295389370.25598.4.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On mån, 2011-01-17 at 21:49 +0200, Peter Eisentraut wrote:
> On sön, 2011-01-02 at 12:41 +0100, Jan Urbański wrote:
> > Here they are. There are 16 patches in total. They amount to what's
> > currently in my refactor branch (and almost to what I've sent as the
> > complete refactor patch, while doing the splitting I discovered a few
> > useless hunks that I've discarded).
>
> Committed 0001, rest later.

Today committed: 3, 5, 10, 11, 12, 13

Discussion:

#2: It looks like this loses some information/formatting in the error
message. Should we keep the pointing arrow there?

--- a/src/pl/plpython/expected/plpython_error.out
+++ b/src/pl/plpython/expected/plpython_error.out
@@ -10,10 +10,7 @@ CREATE FUNCTION sql_syntax_error() RETURNS text
SELECT sql_syntax_error();
WARNING: PL/Python: plpy.SPIError: unrecognized error in PLy_spi_execute_query
CONTEXT: PL/Python function "sql_syntax_error"
-ERROR: syntax error at or near "syntax"
-LINE 1: syntax error
- ^
-QUERY: syntax error
+ERROR: PL/Python: plpy.SPIError: syntax error at or near "syntax"
CONTEXT: PL/Python function "sql_syntax_error"
/* check the handling of uncaught python exceptions
*/

#7: This is unnecessary because the remaining fields are automatically
initialized with NULL. The Python documentation uses initializations
like that. The way I understand it, newer Python versions might add
more fields at the end, and they will rely on the fact that they get
automatically initialized even if the source code was for an older
version. So I would rather not touch this, as it doesn't buy anything.

#16: This is probably pointless because pgindent will reformat this.


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 01:06:15
Message-ID: AANLkTin9rKCOaNH6x_ARYRL4PBys6F4J7YAbNwfbJbJX@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/1/19 Peter Eisentraut <peter_e(at)gmx(dot)net>:
> On mån, 2011-01-17 at 21:49 +0200, Peter Eisentraut wrote:
>> On sön, 2011-01-02 at 12:41 +0100, Jan Urbański wrote:
>> > Here they are. There are 16 patches in total. They amount to what's
>> > currently in my refactor branch (and almost to what I've sent as the
>> > complete refactor patch, while doing the splitting I discovered a few
>> > useless hunks that I've discarded).
>>
>> Committed 0001, rest later.
>
> Today committed: 3, 5, 10, 11, 12, 13

I have reviewed this original patches for myself as the fundamental of
subsequent work, and have comments.

- This is not in the patch, but around line 184 "vis versa" in comment
seems like typo.
- -1 is used as the initial value of PLyTypeInfo.is_rowtype. Why not 0?
- PLy_(input|output)_tuple_funcs() in PLy_trigger_handler() is added.
The comment says it should check for the possibility that the
relation's tupdesc changed since last call. Is it true that tupdesc
may change even in one statement? And it also says the two functions
are responsible for not doing repetitive work, but ISTM they are not
doing something to stop redundant work. The function doesn't seem like
lightweight enough to be called on each row.
- There are elog() and PLy_elog() overall, but it looks like to me
that the usage is inconsistent. Moreover, elog(ERROR, "PLyTypeInfo
struct is initialized for a Datum"); in
PLy_(input|output)_tuple_funcs() should be Assert() not elog()?
- A line break should be added before PLy_add_exception() after "static void"
- This is also not in the patch, but the comment
/* these should only be called once at the first call
* of plpython_call_handler. initialize the python interpreter
* and global data.
*/
is bogus. PLy_init_interp() is called in _PG_init().

That's all for now. Some of them must be my misunderstanding or
trivial, but I post them for the record.

Regards,

--
Hitoshi Harada


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 03:52:16
Message-ID: 1295409077-sup-239@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Peter Eisentraut's message of mar ene 18 19:22:50 -0300 2011:

> #16: This is probably pointless because pgindent will reformat this.

pgindent used to remove useless braces around single-statement blocks,
but this behavior was removed years ago because it'd break formatting
around PG_TRY blocks.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 09:57:34
Message-ID: 4D36B58E.5010307@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18/01/11 23:22, Peter Eisentraut wrote:
> On mån, 2011-01-17 at 21:49 +0200, Peter Eisentraut wrote:
>> On sön, 2011-01-02 at 12:41 +0100, Jan Urbański wrote:
>>> Here they are. There are 16 patches in total. They amount to what's
>>> currently in my refactor branch (and almost to what I've sent as the
>>> complete refactor patch, while doing the splitting I discovered a few
>>> useless hunks that I've discarded).
>>
>> Committed 0001, rest later.
>
> Today committed: 3, 5, 10, 11, 12, 13

Thanks,

> #2: It looks like this loses some information/formatting in the error
> message. Should we keep the pointing arrow there?
>
> --- a/src/pl/plpython/expected/plpython_error.out
> +++ b/src/pl/plpython/expected/plpython_error.out
> @@ -10,10 +10,7 @@ CREATE FUNCTION sql_syntax_error() RETURNS text
> SELECT sql_syntax_error();
> WARNING: PL/Python: plpy.SPIError: unrecognized error in PLy_spi_execute_query
> CONTEXT: PL/Python function "sql_syntax_error"
> -ERROR: syntax error at or near "syntax"
> -LINE 1: syntax error
> - ^
> -QUERY: syntax error
> +ERROR: PL/Python: plpy.SPIError: syntax error at or near "syntax"
> CONTEXT: PL/Python function "sql_syntax_error"
> /* check the handling of uncaught python exceptions
> */

Yes, the message is less informative, because the error is reported by
PL/Python, by wrapping the SPI message. I guess I could try to extract
more info from the caught ErrorData and put it in the new error that
PL/Python throws. But I'd like to do that as a refinement of the
spi-in-subxacts patch, because the current behaviour is broken anyway -
to catch the SPI error safely you need to wrap the whole thing in a
subtransaction. This patch only gets rid of the global error state, and
to do that I needed to raise some exception from the try/catch block
around SPI calls.

> #7: This is unnecessary because the remaining fields are automatically
> initialized with NULL. The Python documentation uses initializations
> like that. The way I understand it, newer Python versions might add
> more fields at the end, and they will rely on the fact that they get
> automatically initialized even if the source code was for an older
> version. So I would rather not touch this, as it doesn't buy anything.

OK.

> #16: This is probably pointless because pgindent will reformat this.

If it does, than it's a shame. Maybe the comment should be moved above
the if() then.

Cheers,
Jan


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 10:09:56
Message-ID: 4D36B874.1050805@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/01/11 02:06, Hitoshi Harada wrote:
> 2011/1/19 Peter Eisentraut <peter_e(at)gmx(dot)net>:
>> On mån, 2011-01-17 at 21:49 +0200, Peter Eisentraut wrote:
>>> On sön, 2011-01-02 at 12:41 +0100, Jan Urbański wrote:
>>>> Here they are. There are 16 patches in total. They amount to what's
>>>> currently in my refactor branch (and almost to what I've sent as the
>>>> complete refactor patch, while doing the splitting I discovered a few
>>>> useless hunks that I've discarded).
>>>
>>> Committed 0001, rest later.
>>
>> Today committed: 3, 5, 10, 11, 12, 13
>
> I have reviewed this original patches for myself as the fundamental of
> subsequent work, and have comments.
>
> - This is not in the patch, but around line 184 "vis versa" in comment
> seems like typo.

Right, should certainly be "vice versa".

> - -1 is used as the initial value of PLyTypeInfo.is_rowtype. Why not 0?

See the comments in struct PLyTypeInfo:

is_rowtype can be: -1 = not known yet (initial state); 0 = scalar
datatype; 1 = rowtype; 2 = rowtype, but I/O functions not set up yet

> - PLy_(input|output)_tuple_funcs() in PLy_trigger_handler() is added.
> The comment says it should check for the possibility that the
> relation's tupdesc changed since last call. Is it true that tupdesc
> may change even in one statement? And it also says the two functions
> are responsible for not doing repetitive work, but ISTM they are not
> doing something to stop redundant work. The function doesn't seem like
> lightweight enough to be called on each row.

Hm, you may be right. I haven't touched that part of the code, but now
it seems to me that for triggers we do the I/O funcs lookup for every
row. I could try to check that and fix it, but not sure if I'll have the
time, and it's been that way before. Also, the CF is already closed in
theory...

> - There are elog() and PLy_elog() overall, but it looks like to me
> that the usage is inconsistent. Moreover, elog(ERROR, "PLyTypeInfo
> struct is initialized for a Datum"); in
> PLy_(input|output)_tuple_funcs() should be Assert() not elog()?

Well in theory PLy_elog should be used if there's a current Python
exception set that you'd like to forward to Postgres, making it a
elog(ERROR). It's true that the usage is sometimes a bit inconsistent,
some of these inconsistencies are cleaned up in the next patches, but
probably not all of them. As for the Assert/elog, I would prefer en
elog, because if there are bugs in that code, using the wrong I/O
functions could lead to unexpected results i production (where an Assert
would not be present).

> - A line break should be added before PLy_add_exception() after "static void"

Oops, you're right.

> - This is also not in the patch, but the comment
> /* these should only be called once at the first call
> * of plpython_call_handler. initialize the python interpreter
> * and global data.
> */
> is bogus. PLy_init_interp() is called in _PG_init().

Yep, that comment looks misguided.

> That's all for now. Some of them must be my misunderstanding or
> trivial, but I post them for the record.

Thanks, your feedback it's very valuable!

Cheers,
Jan


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 10:56:29
Message-ID: AANLkTi=f4P5ny+xxz5JS0QNomjKwFxd6EkkgAdUTAsva@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/1/19 Jan Urbański <wulczer(at)wulczer(dot)org>:
> On 19/01/11 02:06, Hitoshi Harada wrote:
>> - -1 is used as the initial value of PLyTypeInfo.is_rowtype. Why not 0?
>
> See the comments in struct PLyTypeInfo:
>
> is_rowtype can be: -1 = not known yet (initial state); 0 = scalar
> datatype; 1 = rowtype; 2 = rowtype, but I/O functions not set up yet

Well, I mean that 0 = inital, 1 = scalar, 2 = rowtype, 3 = rowtype but
not set up sounds more intuitive to me. Of course this is only what I
feel.

Regards,

--
Hitoshi Harada


From: David Fetter <david(at)fetter(dot)org>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 15:35:22
Message-ID: 20110119153522.GF10624@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 19, 2011 at 11:09:56AM +0100, Jan Urbański wrote:
> On 19/01/11 02:06, Hitoshi Harada wrote:
> > 2011/1/19 Peter Eisentraut <peter_e(at)gmx(dot)net>:
> >> On mån, 2011-01-17 at 21:49 +0200, Peter Eisentraut wrote:
> >>> On sön, 2011-01-02 at 12:41 +0100, Jan Urbański wrote:
> >>>> Here they are. There are 16 patches in total. They amount to what's
> >>>> currently in my refactor branch (and almost to what I've sent as the
> >>>> complete refactor patch, while doing the splitting I discovered a few
> >>>> useless hunks that I've discarded).
> >>>
> >>> Committed 0001, rest later.
> >>
> >> Today committed: 3, 5, 10, 11, 12, 13
> >
> > I have reviewed this original patches for myself as the fundamental of
> > subsequent work, and have comments.
> >
> > - This is not in the patch, but around line 184 "vis versa" in comment
> > seems like typo.
>
> Right, should certainly be "vice versa".
>
> > - -1 is used as the initial value of PLyTypeInfo.is_rowtype. Why not 0?
>
> See the comments in struct PLyTypeInfo:
>
> is_rowtype can be: -1 = not known yet (initial state); 0 = scalar
> datatype; 1 = rowtype; 2 = rowtype, but I/O functions not set up yet
>
> > - PLy_(input|output)_tuple_funcs() in PLy_trigger_handler() is added.
> > The comment says it should check for the possibility that the
> > relation's tupdesc changed since last call. Is it true that tupdesc
> > may change even in one statement? And it also says the two functions
> > are responsible for not doing repetitive work, but ISTM they are not
> > doing something to stop redundant work. The function doesn't seem like
> > lightweight enough to be called on each row.
>
> Hm, you may be right. I haven't touched that part of the code, but now
> it seems to me that for triggers we do the I/O funcs lookup for every
> row. I could try to check that and fix it, but not sure if I'll have the
> time, and it's been that way before. Also, the CF is already closed in
> theory...

If you're fixing things in PL/PythonU, such a change would certainly
be in scope as an update to your patch, i.e. don't let the fact that
the CF has started stop you from fixing it.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: David Fetter <david(at)fetter(dot)org>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 16:07:19
Message-ID: 4D370C37.8030904@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/01/11 16:35, David Fetter wrote:
> On Wed, Jan 19, 2011 at 11:09:56AM +0100, Jan Urbański wrote:
>> On 19/01/11 02:06, Hitoshi Harada wrote:
>>> - PLy_(input|output)_tuple_funcs() in PLy_trigger_handler() is added.
>>> The comment says it should check for the possibility that the
>>> relation's tupdesc changed since last call. Is it true that tupdesc
>>> may change even in one statement? And it also says the two functions
>>> are responsible for not doing repetitive work, but ISTM they are not
>>> doing something to stop redundant work. The function doesn't seem like
>>> lightweight enough to be called on each row.
>>
>> Hm, you may be right. I haven't touched that part of the code, but now
>> it seems to me that for triggers we do the I/O funcs lookup for every
>> row. I could try to check that and fix it, but not sure if I'll have the
>> time, and it's been that way before. Also, the CF is already closed in
>> theory...

I looked again and it seems that PLy_(input|output)_tuple_funcs does
indeed avoid repetitive work. Observe that in the loop going over the
TupleDesc attributes there's

if (arg->in.r.atts[i].typoid == desc->attrs[i]->atttypid)
continue; /* already set up this entry */

which avoids the syscache lookup and calling PLy_input_datum_func2.

Cheers,
Jan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Jan Urbański <wulczer(at)wulczer(dot)org>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 17:20:00
Message-ID: 21644.1295457600@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Excerpts from Peter Eisentraut's message of mar ene 18 19:22:50 -0300 2011:
>> #16: This is probably pointless because pgindent will reformat this.

> pgindent used to remove useless braces around single-statement blocks,
> but this behavior was removed years ago because it'd break formatting
> around PG_TRY blocks.

Yeah. FWIW, I concur with Jan that this is a readability improvement.
A comment and a statement always look like two statements to me ---
and the fact that pgindent insists on inserting a blank line in front
of the comment in this scenario does even more to mangle the visual
impression of what the "if" controls. +1 for the braces.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 19:59:10
Message-ID: 1295467150.21153.0.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On ons, 2011-01-19 at 00:52 -0300, Alvaro Herrera wrote:
> Excerpts from Peter Eisentraut's message of mar ene 18 19:22:50 -0300 2011:
>
> > #16: This is probably pointless because pgindent will reformat this.
>
> pgindent used to remove useless braces around single-statement blocks,
> but this behavior was removed years ago because it'd break formatting
> around PG_TRY blocks.

Good to know. Committed then.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jan Urbański <wulczer(at)wulczer(dot)org>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-19 21:21:19
Message-ID: AANLkTimd7xYdy3yU+ehMa9CTPS=dPXJjPCH-gVaHzxSc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 19, 2011 at 2:59 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On ons, 2011-01-19 at 00:52 -0300, Alvaro Herrera wrote:
>> Excerpts from Peter Eisentraut's message of mar ene 18 19:22:50 -0300 2011:
>>
>> > #16: This is probably pointless because pgindent will reformat this.
>>
>> pgindent used to remove useless braces around single-statement blocks,
>> but this behavior was removed years ago because it'd break formatting
>> around PG_TRY blocks.
>
> Good to know.  Committed then.

I cracked up upon reading your commit message.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-20 00:26:23
Message-ID: 4D37812F.1030905@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/01/11 10:57, Jan Urbański wrote:
> On 18/01/11 23:22, Peter Eisentraut wrote:
>> #2: It looks like this loses some information/formatting in the error
>> message. Should we keep the pointing arrow there?

>> CONTEXT: PL/Python function "sql_syntax_error"
>> -ERROR: syntax error at or near "syntax"
>> -LINE 1: syntax error
>> - ^
>> -QUERY: syntax error
>> +ERROR: PL/Python: plpy.SPIError: syntax error at or near "syntax"
>> CONTEXT: PL/Python function "sql_syntax_error"

> Yes, the message is less informative, because the error is reported by
> PL/Python, by wrapping the SPI message. I guess I could try to extract
> more info from the caught ErrorData and put it in the new error that
> PL/Python throws.

All right, I found a way to shoehorn the extra information into
SPIException, I'll post a new patch series with what's left of the
general refactoring patch soon.

Shortly after, I'll post updated patches for doing SPI in subxacts,
explicit subxact starting and generated SPI exceptions, as they will
surely be broken by these changes.

Jan


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-20 02:16:13
Message-ID: 4D379AED.1020000@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 20/01/11 01:26, Jan Urbański wrote:
> On 19/01/11 10:57, Jan Urbański wrote:
>> On 18/01/11 23:22, Peter Eisentraut wrote:
>>> #2: It looks like this loses some information/formatting in the error
>>> message. Should we keep the pointing arrow there?
>
>>> CONTEXT: PL/Python function "sql_syntax_error"
>>> -ERROR: syntax error at or near "syntax"
>>> -LINE 1: syntax error
>>> - ^
>>> -QUERY: syntax error
>>> +ERROR: PL/Python: plpy.SPIError: syntax error at or near "syntax"
>>> CONTEXT: PL/Python function "sql_syntax_error"
>
>> Yes, the message is less informative, because the error is reported by
>> PL/Python, by wrapping the SPI message. I guess I could try to extract
>> more info from the caught ErrorData and put it in the new error that
>> PL/Python throws.
>
> All right, I found a way to shoehorn the extra information into
> SPIException, I'll post a new patch series with what's left of the
> general refactoring patch soon.

Here's an updated patch series for PL/Python refactoring. It was 16
patches at first, 8 are committed, 1 got dropped, so we're down to 7.

Cheers,
Jan

Attachment Content-Type Size
0007-Do-not-prefix-error-messages-with-the-string-PL-Pyth.patch text/x-patch 0 bytes

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-20 20:38:33
Message-ID: 1295555913.27153.2.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On ons, 2011-01-19 at 10:06 +0900, Hitoshi Harada wrote:
> - This is not in the patch, but around line 184 "vis versa" in comment
> seems like typo.

Fixed.

> - A line break should be added before PLy_add_exception() after "static void"

I'll add that when I get to the patch.

> - This is also not in the patch, but the comment
> /* these should only be called once at the first call
> * of plpython_call_handler. initialize the python interpreter
> * and global data.
> */
> is bogus. PLy_init_interp() is called in _PG_init().

Fixed.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-22 20:53:41
Message-ID: 1295729621.1770.3.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tor, 2011-01-20 at 03:16 +0100, Jan Urbański wrote:
> Here's an updated patch series for PL/Python refactoring. It was 16
> patches at first, 8 are committed, 1 got dropped, so we're down to 7.

> Refactor PLy_spi_prepare to save two levels of indentation.
>
> Instead of checking if the arglist is NULL and then if its length is
> 0, do it in one step, and outside of the try/catch block.

Why is it a good idea to do the PLy_malloc calls outside of the
try/catch block? Also, why call them when nargs is 0?


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-22 22:10:56
Message-ID: 4D3B55F0.9030803@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22/01/11 21:53, Peter Eisentraut wrote:
> On tor, 2011-01-20 at 03:16 +0100, Jan Urbański wrote:
>> Here's an updated patch series for PL/Python refactoring. It was 16
>> patches at first, 8 are committed, 1 got dropped, so we're down to 7.
>
>> Refactor PLy_spi_prepare to save two levels of indentation.
>>
>> Instead of checking if the arglist is NULL and then if its length is
>> 0, do it in one step, and outside of the try/catch block.
>
> Why is it a good idea to do the PLy_malloc calls outside of the
> try/catch block? Also, why call them when nargs is 0?

I think it's better to call them outside of the try/catch, because if
palloc failed, we'de be better off just interrupting the function
execution and raising an error than catching the longjmp and turning it
into a Python exception, which will probably make Python fail with a
MemoryError really soon. And I've even seen Python segfaulting when it
ran out of memory instead of raising a MemoryError exception.

As for the nargs == 0 case, you're right, it should read

plan->types = nargs ? PLy_malloc(sizeof(Oid) * nargs) : NULL;

especially since PLy_plan_dealloc does

if (ob->types)
PLy_free(ob->types)

Cheers,
Jan


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-26 23:40:43
Message-ID: 1296085243.14137.13.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tor, 2011-01-20 at 03:16 +0100, Jan Urbański wrote:
> Here's an updated patch series for PL/Python refactoring. It was 16
> patches at first, 8 are committed, 1 got dropped, so we're down to 7.

Everything(*) is now committed.

In 0006-Improve-exception-usage-in-PL-Python.patch I went for TypeError
instead of ValueError because that matched better with the behavior of
some Python built-ins. Same idea, though.

(*) In 0007-Do-not-prefix-error-messages-with-the-string-PL-Pyth.patch,
I did not commit the bit that moved pg_verifymbstr outside the TRY
block. This is debatable. I observe that there are other uses of
pg_verifymbstr in TRY blocks. Also, we document that strings in Python
code must be in the server encoding, so I would argue that this error
could be considered a Python error and thus the current code would be
correct.


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python refactoring
Date: 2011-01-26 23:43:53
Message-ID: 4D40B1B9.2090508@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27/01/11 00:40, Peter Eisentraut wrote:
> On tor, 2011-01-20 at 03:16 +0100, Jan Urbański wrote:
>> Here's an updated patch series for PL/Python refactoring. It was 16
>> patches at first, 8 are committed, 1 got dropped, so we're down to 7.
>
> Everything(*) is now committed.

Great, thanks.

I'll be posting updated versions of the remaining patches to their
corresponding threads (or their review threads, if they already exist).

Jan