WIP: plpython3

Lists: pgsql-hackers
From: James Pye <lists(at)jwp(dot)name>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: WIP: plpython3
Date: 2009-07-23 22:23:40
Message-ID: FF3A61B6-D1EB-4057-9071-7415FE03DBDA@jwp.name

http://github.com/jwp/postgresql-plpython3/tree/plpython3 [branch
name: plpython3]
[src/pl/plpython3] (Yeah, I'm going to try to move it to
git.postgresql.org soon-ish)

In a recent thread[1], Peter said:

That also means that maintaining a separate, parallel code base
for a Python 3 variant can only be acceptable if it gives major
advantages.

Here are the features that I plan/hope to implement before submitting
any patch:

* Native Typing [Python types that represent Postgres types]
* Reworked function structure (Python modules, not function fragments)
* Improved SQL interfaces (prepared statement objects[2])
* Better SRF support(?) (uses iterators, will support composites,
value-per-call & materialize modes)
* Direct function calls (to other Postgres functions)
* IST support (with xact(): ...)
* Full tracebacks for Python exceptions (CONTEXT support)
* Cached bytecode (presuming a "procache" attributes patch would be
acceptable[3])

The first two features are why a new PL should be incorporated.

Native typing alone is desirable because it allows for Postgres
type semantics to be retained inside Python. Using conversion for some
types--the existing solution in plpython--may not be desirable due to
potential inconsistencies in value. A notable example is that Python's
datetime.timedelta cannot support interval's month field. And from a
performance perspective, creating Python objects representing a
parameter is approximately the cost of allocating memory for a Python
object and datumCopy.
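
The interval point can be illustrated with stock Python. This is only a sketch: the Interval class below is a hypothetical stand-in, not the PL's actual type. It assumes the three-field layout (months, days, microseconds) that Postgres uses for interval, and shows that timedelta has no place to keep a month count.

```python
import datetime
from dataclasses import dataclass

# timedelta stores only days, seconds and microseconds; it has no months.
try:
    datetime.timedelta(months=1)
except TypeError as exc:
    months_error = exc  # 'months' is not an accepted keyword argument
else:
    months_error = None


# Hypothetical stand-in for a native type that keeps interval's three
# fields separate instead of collapsing them into one duration.
@dataclass(frozen=True)
class Interval:
    months: int = 0
    days: int = 0
    microseconds: int = 0


one_month = Interval(months=1)
thirty_days = Interval(days=30)

# Converting through timedelta forces a choice such as 1 month == 30 days,
# after which the two distinct intervals can no longer be told apart.
lossy = datetime.timedelta(days=30 * one_month.months + one_month.days)
```

With the fields kept separate, one_month and thirty_days compare unequal; after conversion both become the same 30-day timedelta.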

The second feature, function structure, is actually new to the PL.
Originally PL/Py took a pl/python-like approach to triggers and
functions. *Currently*, I want to change procedures to be Python
modules with specific entry points used to handle an event. Mere
invocation: "main". Or, a trigger event: "before_insert",
"after_insert", "before_update", etc.

So, a regular function might look like:

CREATE OR REPLACE FUNCTION foo(int) RETURNS int LANGUAGE plpython3u AS
$python$
import Postgres

def main(i):
    return i
$python$;

Despite the signature repetition, this is an improvement for the user
and the developer. The user now has an explicit initialization section
that is common to Python (it's a module). The PL developer no longer
needs to munge the source, and can work with common Python APIs to
manage and introspect the procedure's module (...thinking: procedure
settings...).
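
A minimal sketch of what "no munging" could look like on the handler side, using only the standard library. The load_procedure helper and the sample source are assumptions for illustration and are not taken from the plpython3 branch; the sample omits "import Postgres" so that it runs outside the server.

```python
import inspect
import types

# Hypothetical procedure source, as it might be stored for the function.
source = '''
counter = 0  # module-level code runs once, at load time

def main(i):
    return i
'''


def load_procedure(name, src):
    """Compile the source unmodified and execute it as a module body."""
    module = types.ModuleType(name)
    code = compile(src, "<" + name + ">", "exec")
    exec(code, module.__dict__)
    return module


proc = load_procedure("foo", source)

# Ordinary introspection works because the procedure is a plain module.
entry_points = sorted(
    name for name, obj in vars(proc).items() if inspect.isfunction(obj)
)
result = proc.main(42)
```

Since the source is compiled as-is, tracebacks and line numbers would refer to exactly what the user wrote.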

A trigger function might look like:

CREATE OR REPLACE FUNCTION trig() RETURNS TRIGGER LANGUAGE plpython3u AS
$python$
import Postgres

def check(i):
    ...

def before_insert(new):
    ...

def before_update(new, old):
    # The default action is for the manipulation to occur,
    # so users must explicitly raise FilterEvent in order to
    # stop a row from being inserted, updated, deleted.
    if check(new["column_name"]):
        raise StopEvent()

def after_delete(old):
    ...

$python$;
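
One way the handler could map trigger events onto those entry points, sketched in plain Python. Everything here is an assumption for illustration: the fire helper, the way StopEvent is made visible to the module, and the "qty" column are invented, and the real PL may dispatch differently.

```python
import types


class StopEvent(Exception):
    """Hypothetical signal that the row manipulation should be skipped."""


source = '''
def before_update(new, old):
    if new["qty"] < 0:
        raise StopEvent()
'''

module = types.ModuleType("trig")
module.StopEvent = StopEvent  # stand-in for whatever the PL would expose
exec(compile(source, "<trig>", "exec"), module.__dict__)


def fire(module, timing, event, *rows):
    """Return True if the manipulation should proceed, False if filtered."""
    handler = getattr(module, timing + "_" + event, None)
    if handler is None:
        return True  # no entry point for this event: default action
    try:
        handler(*rows)
    except StopEvent:
        return False
    return True


allowed = fire(module, "before", "update", {"qty": 1}, {"qty": 0})
blocked = fire(module, "before", "update", {"qty": -1}, {"qty": 0})
unhandled = fire(module, "after", "delete", {"qty": 0})
```

Events without a matching entry point fall through to the default action, so a module only needs to define the handlers it cares about.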

Thoughts? [...it still has a *long* ways to go =]

[1] http://archives.postgresql.org/pgsql-hackers/2009-05/msg01376.php
[2] http://python.projects.postgresql.org/docs/0.9/driver.html#prepared-statement-interface-points
[3] http://archives.postgresql.org/pgsql-hackers/2006-05/msg01160.php
(I think a new column would be wise)


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: James Pye <lists(at)jwp(dot)name>
Subject: Re: WIP: plpython3
Date: 2009-07-24 08:21:19
Message-ID: 200907241121.19524.peter_e@gmx.net

On Friday 24 July 2009 01:23:40 James Pye wrote:
> Here are the features that I plan/hope to implement before submitting
> any patch:
>
> * Native Typing [Python types that represent Postgres types]
> * Reworked function structure (Python modules, not function fragments)
> * Improved SQL interfaces (prepared statement objects[2])
> * Better SRF support(?) (uses iterators, will support composites,
> vpc & mat)
> * Direct function calls (to other Postgres functions)
> * IST support (with xact(): ...)
> * Full tracebacks for Python exceptions(CONTEXT support)
> * Cached bytecode (presuming a "procache" attributes patch would be
> acceptable[3])

While various of these ideas may be good, I think you are setting yourself up
for a rejection. There is a lot of plpython code already out there, and many
years have gone into debugging plpython to work well, so rewriting everything
and setting everyone up for a flag day, or requiring the parallel maintenance
of old and new versions of plpython is not going to work. Plus, tying all of
this up with Python 3 will make totally sure that no one except a minority
will be able to use it.

As far as I can tell, most of the features you list above could very well be
implemented in the current language handler, using separate, isolated patches.
I don't see why everything needs to be written from scratch.


From: James Pye <lists(at)jwp(dot)name>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: plpython3
Date: 2009-07-24 11:24:50
Message-ID: 4D7FDC60-AE47-4D34-B833-28705812B82D@jwp.name

On Jul 24, 2009, at 1:21 AM, Peter Eisentraut wrote:
> While various of these ideas may be good, I think you are setting
> yourself up
> for a rejection.

Right, I supposed that that may be the case or at least that you would
feel this way based on your messages from the prior thread.

> There is a lot of plpython code already out there, and many
> years have gone into debugging plpython to work well, so rewriting
> everything
> and setting everyone up for a flag day, or requiring the parallel
> maintenance
> of old and new versions of plpython is not going to work.

Does this mean that you are no longer of the opinion that a separate
implementation is acceptable under the circumstances that it provides
major advantages?
Or are you of the opinion that the listed features do not provide
major advantages?
Or, perhaps, more appropriately, that the transitional features do not
provide major advantages?

[transitional features being native typing and reworked function
structure]

> As far as I can tell, most of the features you list above could very
> well be
> implemented in the current language handler, using separate,
> isolated patches.
> I don't see why everything needs to be written from scratch.

That's why I tried to highlight native typing and the reworked
function structure.
Those two features, not to mention Python 3, make it a distinct-enough
beast to justify a different code base, IMO. The rest are icing. Icing
is delicious.

I see Python 3 as a good opportunity to change the interfaces and fix
the design of the PL.

I dunno. I have time to give it some TLC, and I'm not terribly excited
about trying to tack features onto something that I find kinda gross.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: James Pye <lists(at)jwp(dot)name>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: plpython3
Date: 2009-07-24 17:29:53
Message-ID: 1248456593.4441.9.camel@jd-laptop.pragmaticzealot.org

On Fri, 2009-07-24 at 04:24 -0700, James Pye wrote:

> I see Python 3 as a good opportunity to change the interfaces and fix
> the design of the PL.
>
> I dunno. I have time to give it some TLC, and I'm not terribly excited
> about trying to tack features onto something that I find kinda gross.
>

If someone wants to actually take the time to create a better plpython,
I say more power to him. It is a bit unfortunate that it is tied
explicitly to python 3 but I can see advantages to that as well.

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Stuart Bishop <stuart(at)stuartbishop(dot)net>
To: James Pye <lists(at)jwp(dot)name>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: plpython3
Date: 2009-07-25 02:08:28
Message-ID: fxjpchgh3hz8cq3yhzUYAxe124vaj_firegpg@mail.gmail.com

On Fri, Jul 24, 2009 at 5:23 AM, James Pye<lists(at)jwp(dot)name> wrote:

>   That also means that maintaining a separate, parallel code base
>   for a Python 3 variant can only be acceptable if it gives major
> advantages.

I'm not particularly interested in Python 3.x support yet (we are still back on 2.4, soon to hop to 2.5 or 2.6. For us 3.1 is probably 2 years away at the earliest). I am interested in improved plpython though.

>  * Reworked function structure (Python modules, not function fragments)

I think it would be an improvement to move away from function fragments. One thing I would like to be able to do is have my Python test suite import my plpython and run tests on it. This would be much easier to do if instead of 'import Postgres' to pull in the api, an object was passed into the entry point which provides the interface to PostgreSQL. This way I can pass in a mock object. This is also useful outside of the test suite - the same module can be used as a stored procedure or by your Python application - your web application can use the same validators as your check constraints for instance.
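
A rough sketch of the testing benefit, assuming the interface object is passed as the first argument. The pg.error method name and the email validator are invented for illustration only.

```python
from unittest import mock


# Hypothetical procedure body: the PostgreSQL interface arrives as the
# first argument instead of via "import Postgres".
def pg_main(pg, email):
    if "@" not in email:
        pg.error("invalid email address")  # assumed method name
        return False
    return True


# In a test suite, a mock stands in for the real interface.
fake_pg = mock.Mock()
ok = pg_main(fake_pg, "user@example.org")
bad = pg_main(fake_pg, "not-an-address")
error_calls = fake_pg.error.call_count
```

The same pg_main could then be called from a web application with its own reporting object in place of the mock.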

> The second feature, function structure, is actually new to the PL.
> Originally PL/Py took a pl/python-like approach to triggers and functions.
> *Currently*, I want to change procedures to be Python modules with specific
> entry points used to handle an event. Mere invocation: "main". Or, a trigger
> event: "before_insert", "after_insert", "before_update", etc.

> So, a regular function might look like:
>
> CREATE OR REPLACE FUNCTION foo(int) RETURNS int LANGUAGE plpython3u AS
> $python$
> import Postgres
>
> def main(i):
>    return i
> $python$;
>
> Despite the signature repetition, this is an improvement for the user and
> the developer. The user now has an explicit initialization section that is
> common to Python(it's a module). The PL developer no longer needs to munge
> the source, and can work with common Python APIs to manage and introspect
> the procedure's module(...thinking: procedure settings..).

I'd like a way to avoid initialization on module import if possible. Calling an initialization function after module import, if it exists, would do this.

CREATE FUNCTION foo(int) RETURNS int LANGUAGE plpythonu AS
$python$
[initialization on module import]
def pg_init(pg):
    [initialization after module import]
def pg_main(pg, i):
    return i
$python$;
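
Roughly, the handler side of that could look like the following sketch. The Procedure wrapper is hypothetical, and a plain object() stands in for the interface that would be passed as pg.

```python
import types

source = '''
events = ["import"]          # runs on module import

def pg_init(pg):
    events.append("init")    # runs once, after import

def pg_main(pg, i):
    events.append("main")
    return i
'''


class Procedure:
    """Hypothetical handler-side wrapper around one procedure module."""

    def __init__(self, name, src, pg):
        self.pg = pg
        self.module = types.ModuleType(name)
        exec(compile(src, "<" + name + ">", "exec"), self.module.__dict__)
        # Call the optional initialization hook only if it is defined.
        init = getattr(self.module, "pg_init", None)
        if init is not None:
            init(pg)

    def __call__(self, *args):
        return self.module.pg_main(self.pg, *args)


foo = Procedure("foo", source, pg=object())
results = [foo(1), foo(2)]
events = foo.module.events
```

The hook runs once per load, while pg_main runs on every call.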

> Thoughts? [...it still has a *long* ways to go =]

I tend to dislike magic function names, but perhaps it is the most usable solution.

--
Stuart Bishop <stuart(at)stuartbishop(dot)net>
http://www.stuartbishop.net/


From: James Pye <lists(at)jwp(dot)name>
To: Stuart Bishop <stuart(at)stuartbishop(dot)net>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: plpython3
Date: 2009-07-25 03:31:15
Message-ID: DB936DFC-66AD-4F91-9868-3B89D9AD323F@jwp.name

On Jul 24, 2009, at 7:08 PM, Stuart Bishop wrote:
> I'm not particularly interested in Python 3.x support yet (we are
> still back on 2.4, soon to hop to 2.5 or 2.6. For us 3.1 is probably
> 2 years away at the earliest). I am interested in improved plpython
> though.

Two years would hopefully be enough time to work out most of the new
bugs. =)

> This way I can pass in a mock object. This is also useful outside of
> the test suite - the same module can be used as a stored procedure
> or by your Python application - your web application can use the
> same validators as your check constraints for instance.

Hmm.

import sys
sys.modules["Postgres"] = mock_pg_module

Would that not suffice?
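
Spelled out as a runnable sketch (the quote_literal attribute is invented purely for this illustration and is not a claim about the real Postgres module):

```python
import sys
import types

# Build a stand-in module with one made-up attribute.
mock_pg_module = types.ModuleType("Postgres")
mock_pg_module.quote_literal = lambda s: "'" + s.replace("'", "''") + "'"

sys.modules["Postgres"] = mock_pg_module

# Any later "import Postgres", including one inside the procedure's
# module, now resolves to the stand-in.
import Postgres

quoted = Postgres.quote_literal("it's")
```

The replacement has to be in place before the module under test is first imported, since the import binds whatever sys.modules holds at that moment.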

> I'd like a way to avoid initialization on module import if possible.
> Calling an initialization function after module import, if it
> exists, would do this.
>
> CREATE FUNCTION foo(int) RETURNS int LANGUAGE plpythonu AS
> $python$
> [initialization on module import]
> def pg_init(pg):
>     [initialization after module import]
> def pg_main(pg, i):
>     return i
> $python$;

I do like this idea. However, it may already be possible under the
current design with some explicit main() management:

CREATE ...
$python$
import Postgres

def usual(*args):
    ...

def init(*args):
    global main
    ...
    main = usual
    return usual(*args)

main = init
$python$;

Perhaps ugly, but I imagine a construct could be created to clean it up:

CREATE ...
$python$
import Postgres

def usual(*args):
    ...

def init(*args):
    ...
    return usual(*args)

main = call_once_then(init, lambda: globals().update(main=usual))
$python$;

Hmm, still ugly tho, no?

Well, the above examples aren't actually consistent with your design,
but perhaps it achieves the desired result?
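
For concreteness, a minimal sketch of such a construct. call_once_then does not exist anywhere; it is defined here only to show the idea, and it assumes the PL looks up "main" in the module on each call rather than caching the function object.

```python
def call_once_then(first, after):
    """Return a callable that runs `first`, then `after`, when invoked.

    `after` is expected to rebind the entry point so that later calls
    bypass this wrapper entirely.
    """
    def wrapper(*args):
        result = first(*args)
        after()
        return result
    return wrapper


calls = []


def usual(*args):
    calls.append("usual")
    return args


def init(*args):
    calls.append("init")
    return usual(*args)


main = call_once_then(init, lambda: globals().update(main=usual))

first = main(1)   # runs init, which delegates to usual, then rebinds main
second = main(2)  # main is now usual, so init is skipped
```

If the handler did cache the function object, the rebinding would have no effect and the wrapper would need to track its own "already initialized" state instead.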

> I tend to dislike magic function names, but perhaps it is the most
> usable solution.

Indeed.