Re: pl/python custom datatype parsers

Lists: pgsql-hackers
From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pl/python custom datatype parsers
Date: 2010-12-23 14:15:08
Message-ID: 4D13596C.3020204@wulczer.org

Here's a patch implementing custom parsers for data types mentioned in
http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
an incremental patch on top of the plpython-refactor patch sent earlier.

Git branch for this patch:
https://github.com/wulczer/postgres/tree/custom-parsers.

The idea has been discussed in
http://archives.postgresql.org/pgsql-hackers/2010-12/msg01307.php.

With that patch, when built with --with-python, the hstore module
includes code that adds a GUC called plpython.hstore.

This GUC should be set to the full name of the hstore datatype, for
instance plpython.hstore = 'public.hstore'.

If it is set, the datatype's OID is looked up and hstore sets up a
rendezvous variable called PLPYTHON_<OID>_PARSERS that points to two
functions that can convert a hstore Datum to a PyObject and back.

PL/Python, on the other hand, when it sees an argument with an unknown
type, tries to look up a rendezvous variable using the type's OID, and if
it finds one, it uses the parser functions pointed at by that variable.
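
The handshake described above can be modelled in a few lines of plain Python. This is only a toy sketch: the real mechanism is C code using the backend's rendezvous variables, and the names `rendezvous`, `register_parsers`, `datum_to_python`, the naive converters and the OID 16385 are all made up for illustration.

```python
# Toy model of the rendezvous-variable handshake; every name is hypothetical.
rendezvous = {}  # stands in for the backend's table of rendezvous variables


def register_parsers(type_oid, to_python, from_python):
    """What the hstore side does once plpython.hstore resolves to an OID."""
    rendezvous["PLPYTHON_%d_PARSERS" % type_oid] = (to_python, from_python)


def datum_to_python(type_oid, text_value):
    """What the PL/Python side does for a type it has no built-in handling for."""
    parsers = rendezvous.get("PLPYTHON_%d_PARSERS" % type_oid)
    if parsers is None:
        return text_value  # no parsers registered: fall back to the string form
    to_python, _ = parsers
    return to_python(text_value)


def naive_hstore_to_dict(text_value):
    # Deliberately simplistic: no quoting, escaping or NULL handling.
    pairs = (item.split("=>") for item in text_value.split(","))
    return {k.strip(): v.strip() for k, v in pairs}


def naive_dict_to_hstore(d):
    return ", ".join("%s=>%s" % (k, v) for k, v in d.items())


HSTORE_OID = 16385  # made-up OID for the example

before = datum_to_python(HSTORE_OID, "a=>3,b=>4")   # nothing registered yet
register_parsers(HSTORE_OID, naive_hstore_to_dict, naive_dict_to_hstore)
after = datum_to_python(HSTORE_OID, "a=>3,b=>4")    # parsers are now found
```

Before registration the value stays a string; afterwards the same call yields a dict, which is the behaviour switch the GUC is meant to control.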

Long story short, it works like this:

LOAD 'hstore';
SET plpython.hstore = 'public.hstore';
CREATE FUNCTION pick_one(h hstore, key text) RETURNS hstore AS $$
return {key: h[key]}
$$ LANGUAGE plpythonu;
SELECT pick_one('a=>3,b=>4', 'b');
-- gives back a hstore 'b=>4'
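
Outside the database, the function body is ordinary dictionary code. A standalone sketch of what it computes, assuming the hstore argument arrives as a dict whose keys and values are strings (an assumption; the message does not spell out the value types):

```python
def pick_one(h, key):
    # h is assumed to be the dict that the hstore argument was converted to
    return {key: h[key]}


result = pick_one({"a": "3", "b": "4"}, "b")
```

The returned one-element dict is what would then be converted back into an hstore value.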

There's some ugliness with how hstore's Makefile handles building it,
and I'm not sure what's needed to make it work with the Windows build
system. Also, documentation is missing. It's already usable, but if we
decide to commit that, I'll probably need some help with Windows and docs.

I first tried to make hstore generate a separate .so with that
functionality if --with-python was specified, but couldn't convince the
Makefile to do that. So if you configure the tree with --with-python,
hstore will link to libpython, maybe that's OK?

Cheers,
Jan

PS: of course, once committed we can add custom parsers for isbn,
citext, uuids, cubes, and other weird things.

J

Attachment: plpython-custom-parsers.diff (text/x-patch, 14.7 KB)

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-01-27 22:03:37
Message-ID: 4D41EBB9.2030106@wulczer.org

On 23/12/10 15:15, Jan Urbański wrote:
> Here's a patch implementing custom parsers for data types mentioned in
> http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
> an incremental patch on top of the plpython-refactor patch sent earlier.

Updated to master.

Attachment: plpython-custom-parsers.diff (text/x-patch, 14.7 KB)

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-02-04 16:19:05
Message-ID: AANLkTikx5B0CO7nytE+cghoQej3w2V1pNPzX8k-C-o8M@mail.gmail.com

2011/1/28 Jan Urbański <wulczer(at)wulczer(dot)org>:
> On 23/12/10 15:15, Jan Urbański wrote:
>> Here's a patch implementing custom parsers for data types mentioned in
>> http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
>> an incremental patch on top of the plpython-refactor patch sent earlier.
>
> Updated to master.

I reviewed this for some time today.

The patch applies with hunks, compiles, and the tests pass, though it
does not appear to come with any additional tests of its own.

- in hstore_plpython.c,
    PLyParsers parsers = {
        .in = hstore_to_dict,
        .out = dict_to_hstore
    };
I'm not sure if this coding style is used anywhere in the core.
Isn't this the C99 style?

- You need to define a custom variable class to use this feature, e.g.
plpython.hstore = 'public.hstore'. I wonder why it isn't called
plpythonu.hstore = 'public.hstore' (with 'u'), since the language
is called "plpythonu".

- typo in plpython.h,
Types for parsres functions that ...

- I tried the sample you mention upthread,
regression=# select pick_one('a=>3, b=>4', 'b');
ERROR: TypeError: string indices must be integers
CONTEXT: PL/Python function "pick_one"

My python is 2.4.3 again.

That's it for now. It is an exciting feature, and plpython will be the
first language to think of when you're building an "object database" if
this feature gets in. The design here will affect pl/perl and the other
languages that follow, so it is important enough to discuss.

Regards,

--
Hitoshi Harada


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-02-06 18:01:41
Message-ID: 4D4EE205.30403@wulczer.org

On 04/02/11 17:19, Hitoshi Harada wrote:
> 2011/1/28 Jan Urbański <wulczer(at)wulczer(dot)org>:
>> On 23/12/10 15:15, Jan Urbański wrote:
>>> Here's a patch implementing custom parsers for data types mentioned in
>>> http://archives.postgresql.org/pgsql-hackers/2010-12/msg01991.php. It's
>>> an incremental patch on top of the plpython-refactor patch sent earlier.
>>
>> Updated to master.
>
> I reviewed this for some time today.

Thank you.

> The patch applies with hunks, compiles, and the tests pass, though it
> does not appear to come with any additional tests of its own.

I added a simple test. I had to add an expected file for the case when
hstore is compiled without PL/Python integration.

> - in hstore_plpython.c,
> PLyParsers parsers = {
> .in = hstore_to_dict,
> .out = dict_to_hstore
> };
> I'm not sure if this coding style is used anywhere in the core.
> Isn't this the C99 style?

Ooops, you're right. Fixed.

> - You need to define a custom variable class to use this feature, e.g.
> plpython.hstore = 'public.hstore'. I wonder why it isn't called
> plpythonu.hstore = 'public.hstore' (with 'u'), since the language
> is called "plpythonu".

I think plpython.hstore was what showed up in discussion... I'd be fine
with calling the variable plpythonu.hstore, if that's the consensus.

> - typo in plpython.h,
> Types for parsres functions that ...

Fixed.

> - I tried the sample you mention upthread,
> regression=# select pick_one('a=>3, b=>4', 'b');
> ERROR: TypeError: string indices must be integers
> CONTEXT: PL/Python function "pick_one"
>
> My python is 2.4.3 again.

Hm, this means that the hstore has not been transformed into a Python
dict, but into a string, which is what happens if you *don't* have
plpython hstore integration enabled. I think that was because of an
issue with my changes to hstore's Makefile, which made it compile without
Python support even if the sources were configured with --with-python.
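
The same error is easy to reproduce in a plain Python session, because indexing a str with a str key raises TypeError. A minimal sketch, with the function body copied from the example upthread (the exact wording of the message differs between Python versions):

```python
def pick_one(h, key):
    return {key: h[key]}


try:
    # h is the raw text form of the hstore, i.e. a str rather than a dict
    pick_one("a=>3, b=>4", "b")
    error = None
except TypeError as exc:
    error = exc
```

So a TypeError from `h[key]` is a quick sign that the argument was never converted to a dict.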

There's also a gotcha: if you set plpython.hstore to 'public.hstore',
you will have to DROP (or CREATE OR REPLACE again) all functions that
accept or return hstores, because their I/O routines are already cached.
Not sure how big of a problem that is (or how to fix it in an elegant
manner). Making the parameter PGC_POSTMASTER is an easy solution... but
not very nice.
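
A toy model of that gotcha, in plain Python. Everything here is hypothetical (`settings`, `procedure_cache`, `get_input_converter` are invented names, not PL/Python internals); it only illustrates why a conversion routine chosen at first use keeps being used after the setting changes.

```python
# Toy model; all names are hypothetical and nothing here is PL/Python's real code.
settings = {"plpython.hstore": None}
procedure_cache = {}


def identity(value):
    return value


def to_dict(value):
    # Naive conversion, good enough for the illustration only.
    pairs = (item.split("=>") for item in value.split(","))
    return {k.strip(): v.strip() for k, v in pairs}


def get_input_converter(func_name):
    """Pick the converter on first use and remember it for later calls."""
    if func_name not in procedure_cache:
        chosen = to_dict if settings["plpython.hstore"] else identity
        procedure_cache[func_name] = chosen
    return procedure_cache[func_name]


first = get_input_converter("pick_one")("a=>3,b=>4")   # cached as identity
settings["plpython.hstore"] = "public.hstore"          # the SET comes too late
stale = get_input_converter("pick_one")("a=>3,b=>4")   # still the cached identity
procedure_cache.pop("pick_one")                        # models DROP / CREATE OR REPLACE
fresh = get_input_converter("pick_one")("a=>3,b=>4")   # converter is chosen again
```

In this model, only evicting the cache entry makes the new setting take effect, which mirrors having to recreate the function.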

> That's it for now. It is an exciting feature, and plpython will be the
> first language to think of when you're building an "object database" if
> this feature gets in. The design here will affect pl/perl and the other
> languages that follow, so it is important enough to discuss.

Yes, I ended up writing this patch as a PoC of how you can integrate
procedural languages with arbitrary addon modules, so it would be good
to have a discussion about the general mechanisms. I'm aware that this
discussion, and subsequently this patch, might be punted to 9.2
(although that would be a shame).

Cheers,
Jan

Attachment: plpython-custom-parsers.diff (text/x-patch, 49.0 KB)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-02-11 15:43:49
Message-ID: AANLkTinT6QPE5TefXws2Px80Au_JD7nhZimtwV4MVRY_@mail.gmail.com

On Sun, Feb 6, 2011 at 1:01 PM, Jan Urbański <wulczer(at)wulczer(dot)org> wrote:
>> That's it for now. It is an exciting feature, and plpython will be the
>> first language to think of when you're building an "object database" if
>> this feature gets in. The design here will affect pl/perl and the other
>> languages that follow, so it is important enough to discuss.
>
> Yes, I ended up writing this patch as a PoC of how you can integrate
> procedural languages with arbitrary addon modules, so it would be good
> to have a discussion about the general mechanisms. I'm aware that this
> discussion, and subsequently this patch, might be punted to 9.2
> (although that would be a shame).

It's not clear to me from this discussion whether this patch (a) now
works and has consensus, and should be committed, (b) still needs more
discussion, but hopes to make it into 9.1, or (c) is now 9.2 material.

Can someone please clarify?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-02-11 15:49:36
Message-ID: 4D555A90.8070906@wulczer.org

On 11/02/11 16:43, Robert Haas wrote:
> On Sun, Feb 6, 2011 at 1:01 PM, Jan Urbański <wulczer(at)wulczer(dot)org> wrote:
>>> That's it for now. It is an exciting feature, and plpython will be the
>>> first language to think of when you're building an "object database" if
>>> this feature gets in. The design here will affect pl/perl and the other
>>> languages that follow, so it is important enough to discuss.
>>
>> Yes, I ended up writing this patch as a PoC of how you can integrate
>> procedural languages with arbitrary addon modules, so it would be good
>> to have a discussion about the general mechanisms. I'm aware that this
>> discussion, and subsequently this patch, might be punted to 9.2
>> (although that would be a shame).
>
> It's not clear to me from this discussion whether this patch (a) now
> works and has consensus, and should be committed, (b) still needs more
> discussion, but hopes to make it into 9.1, or (c) is now 9.2 material.

I believe it's (b). But as we don't have time for that discussion that
late in the release cycle, I think we need to consider it identical to (c).

Cheers,
Jan


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-02-11 15:57:58
Message-ID: AANLkTikOcPYTp9tH6RLzRkew2p4ijS8VwZVCjWmAuON0@mail.gmail.com

On Fri, Feb 11, 2011 at 10:49 AM, Jan Urbański <wulczer(at)wulczer(dot)org> wrote:
> On 11/02/11 16:43, Robert Haas wrote:
>> On Sun, Feb 6, 2011 at 1:01 PM, Jan Urbański <wulczer(at)wulczer(dot)org> wrote:
>>>> That's it for now. It is an exciting feature, and plpython will be the
>>>> first language to think of when you're building an "object database" if
>>>> this feature gets in. The design here will affect pl/perl and the other
>>>> languages that follow, so it is important enough to discuss.
>>>
>>> Yes, I ended up writing this patch as a PoC of how you can integrate
>>> procedural languages with arbitrary addon modules, so it would be good
>>> to have a discussion about the general mechanisms. I'm aware that this
>>> discussion, and subsequently this patch, might be punted to 9.2
>>> (although that would be a shame).
>>
>> It's not clear to me from this discussion whether this patch (a) now
>> works and has consensus, and should be committed, (b) still needs more
>> discussion, but hopes to make it into 9.1, or (c) is now 9.2 material.
>
> I believe it's (b). But as we don't have time for that discussion that
> late in the release cycle, I think we need to consider it identical to (c).

OK, I'll mark it Returned with Feedback.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-03-01 16:50:53
Message-ID: 1298998253.30816.3.camel@vanquo.pezone.net

On fre, 2011-02-11 at 16:49 +0100, Jan Urbański wrote:
> I believe it's (b). But as we don't have time for that discussion that
> late in the release cycle, I think we need to consider it identical to (c).

As I previously mentioned, I think that there should be an SQL-level way
to tie together languages and types. I previously mentioned the
SQL-standard command CREATE TRANSFORM as a possibility. I've had this
on my PL/Python TOTHINK list for a while. Thankfully you removed all
the items ahead of this one, so I'll think of something to do in 9.2.

Of course we'll be able to use the actual transform code that you
already wrote.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-11-08 21:08:07
Message-ID: 4EB99A37.6040007@dunslane.net

On 03/01/2011 11:50 AM, Peter Eisentraut wrote:
> On fre, 2011-02-11 at 16:49 +0100, Jan Urbański wrote:
>> I believe it's (b). But as we don't have time for that discussion that
>> late in the release cycle, I think we need to consider it identical to (c).
> As I previously mentioned, I think that there should be an SQL-level way
> to tie together languages and types. I previously mentioned the
> SQL-standard command CREATE TRANSFORM as a possibility. I've had this
> on my PL/Python TOTHINK list for a while. Thankfully you removed all
> the items ahead of this one, so I'll think of something to do in 9.2.
>
> Of course we'll be able to use the actual transform code that you
> already wrote.
>

Peter,

Did you make any progress on this?

cheers

andrew


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2011-11-10 20:13:52
Message-ID: 1320956032.20692.7.camel@vanquo.pezone.net

On tis, 2011-11-08 at 16:08 -0500, Andrew Dunstan wrote:
>
> On 03/01/2011 11:50 AM, Peter Eisentraut wrote:
> > On fre, 2011-02-11 at 16:49 +0100, Jan Urbański wrote:
> >> I believe it's (b). But as we don't have time for that discussion that
> >> late in the release cycle, I think we need to consider it identical to (c).
> > As I previously mentioned, I think that there should be an SQL-level way
> > to tie together languages and types. I previously mentioned the
> > SQL-standard command CREATE TRANSFORM as a possibility. I've had this
> > on my PL/Python TOTHINK list for a while. Thankfully you removed all
> > the items ahead of this one, so I'll think of something to do in 9.2.
> >
> > Of course we'll be able to use the actual transform code that you
> > already wrote.
> >
>
> Peter,
>
> Did you make any progress on this?

No, but it's still somewhere on my list. I saw your blog post related
to this.

I think the first step would be to set up some catalog infrastructure
(without DDL commands and all that overhead), and try to adapt the big
"case" statement of an existing language to that, and then check whether
that works, performance, etc.
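
As a rough illustration of that first step, here is a plain-Python sketch contrasting a hard-coded chain of cases with a lookup table keyed by type OID. It is a model only: `transforms` stands in for a catalog, the function names are invented, 16 and 23 are the OIDs of the built-in bool and int4 types, and 16385 is a made-up OID for an add-on type.

```python
# Hypothetical sketch: a lookup table keyed by type OID replaces a fixed chain of cases.
BOOLOID, INT4OID = 16, 23   # OIDs of the built-in bool and int4 types
HSTORE_OID = 16385          # made-up OID standing in for an add-on type


def hardcoded_to_python(type_oid, text_value):
    # The "big case statement" style: every supported type is spelled out here.
    if type_oid == BOOLOID:
        return text_value == "t"
    elif type_oid == INT4OID:
        return int(text_value)
    return text_value  # anything else stays a string


# Catalog-driven style: entries can be added without touching the dispatcher.
transforms = {
    BOOLOID: lambda s: s == "t",
    INT4OID: int,
}


def catalog_to_python(type_oid, text_value):
    convert = transforms.get(type_oid, str)  # unknown types stay strings
    return convert(text_value)


# An add-on module contributes its own entry, which the hard-coded version cannot pick up.
transforms[HSTORE_OID] = lambda s: dict(
    (k.strip(), v.strip()) for k, v in (item.split("=>") for item in s.split(","))
)
```

The extra dictionary lookup per value is the kind of cost the performance check mentioned above would need to measure.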

Some other concerns off the top of my head:

- Arrays: Would probably not be handled by that. So this would not be
able to handle, for example, switching the array handling behavior in
PL/Perl to ancient compatible mode.

- Range types: no idea

I might work on this, but not before December, would be my guess.


From: Hannu Krosing <hannu(at)krosing(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Jan Urbański <wulczer(at)wulczer(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python custom datatype parsers
Date: 2012-12-14 14:42:21
Message-ID: 50CB3ACD.2060504@krosing.net


Did any (committed?) code result from this thread?

On 11/10/2011 09:13 PM, Peter Eisentraut wrote:
> On tis, 2011-11-08 at 16:08 -0500, Andrew Dunstan wrote:
>> On 03/01/2011 11:50 AM, Peter Eisentraut wrote:
>>> On fre, 2011-02-11 at 16:49 +0100, Jan Urbański wrote:
>>>> I believe it's (b). But as we don't have time for that discussion that
>>>> late in the release cycle, I think we need to consider it identical to (c).
>>> As I previously mentioned, I think that there should be an SQL-level way
>>> to tie together languages and types. I previously mentioned the
>>> SQL-standard command CREATE TRANSFORM as a possibility. I've had this
>>> on my PL/Python TOTHINK list for a while. Thankfully you removed all
>>> the items ahead of this one, so I'll think of something to do in 9.2.
>>>
>>> Of course we'll be able to use the actual transform code that you
>>> already wrote.
>>>
>> Peter,
>>
>> Did you make any progress on this?
> No, but it's still somewhere on my list. I saw your blog post related
> to this.
>
> I think the first step would be to set up some catalog infrastructure
> (without DDL commands and all that overhead), and try to adapt the big
> "case" statement of an existing language to that, and then check whether
> that works, performance, etc.
>
> Some other concerns off the top of my head:
>
> - Arrays: Would probably not be handled by that. So this would not be
> able to handle, for example, switching the array handling behavior in
> PL/Perl to ancient compatible mode.
>
> - Range types: no idea
>
> I might work on this, but not before December, would be my guess.
>
>