pl/python long-lived allocations in datum->dict transformation

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pl/python long-lived allocations in datum->dict transformation
Date: 2012-02-05 18:54:11
Message-ID: 4F2ED053.1010904@wulczer.org
Lists: pgsql-hackers

Consider this:

create table arrays as select array[random(), random(), random(),
random(), random(), random()] as a from generate_series(1, 1000000);

create or replace function plpython_outputfunc() returns void as $$
c = plpy.cursor('select a from arrays')
for row in c:
    pass
$$ language plpythonu;

When the function runs, every row gets transformed into a Python dict,
which includes calling the type's output function and thus allocating
memory. That memory is allocated in the SPI context, so it accumulates
until the function finishes.

This is annoying for functions that plough through large tables, doing
some calculation. Attached is a patch that does the conversion of
PostgreSQL Datums into Python dict objects in a scratch memory context
that gets reset every time.
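
The idea, roughly sketched below (this is not the attached patch; the
context name and where it lives are just illustrative): do the per-row
conversion inside a scratch memory context and reset it afterwards, so
the output functions' palloc'd results are freed right away instead of
piling up in the SPI context.

#include "postgres.h"
#include "utils/memutils.h"

/* Illustrative only: a long-lived pointer to a short-lived scratch context. */
static MemoryContext scratch_cxt = NULL;

static void
convert_row_in_scratch_context(void)
{
    MemoryContext oldcxt;

    /* Create the scratch context lazily, on first use. */
    if (scratch_cxt == NULL)
        scratch_cxt = AllocSetContextCreate(TopMemoryContext,
                                            "PL/Python scratch context",
                                            ALLOCSET_DEFAULT_MINSIZE,
                                            ALLOCSET_DEFAULT_INITSIZE,
                                            ALLOCSET_DEFAULT_MAXSIZE);

    oldcxt = MemoryContextSwitchTo(scratch_cxt);

    /*
     * ... build the Python dict from the tuple here; the output functions'
     * results land in scratch_cxt rather than in the SPI context ...
     */

    MemoryContextSwitchTo(oldcxt);
    MemoryContextReset(scratch_cxt);    /* drop the transient allocations */
}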

Cheers,
Jan

Attachment Content-Type Size
plpython-tuple-to-dict-leak.patch text/x-diff 4.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python long-lived allocations in datum->dict transformation
Date: 2012-02-11 23:48:23
Message-ID: 24229.1329004103@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Urbański <wulczer(at)wulczer(dot)org> writes:
> This is annoying for functions that plough through large tables, doing
> some calculation. Attached is a patch that does the conversion of
> PostgreSQL Datums into Python dict objects in a scratch memory context
> that gets reset every time.

As best I can tell, this patch proposes creating a new, separate context
(chewing up 8KB+) for every plpython procedure that's ever used in a
given session. This cure could easily be worse than the disease as far
as total space consumption is concerned. What's more, it's unclear that
it won't malfunction altogether if the function is used recursively
(ie, what if PLyDict_FromTuple ends up calling the same function again?)
Can't you fix it so that the temp context is associated with a
particular function execution, rather than being "statically" allocated
per-function?

regards, tom lane


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python long-lived allocations in datum->dict transformation
Date: 2012-02-13 22:40:55
Message-ID: 4F399177.9040200@wulczer.org
Lists: pgsql-hackers

On 12/02/12 00:48, Tom Lane wrote:
> Jan Urbański <wulczer(at)wulczer(dot)org> writes:
>> This is annoying for functions that plough through large tables, doing
>> some calculation. Attached is a patch that does the conversion of
>> PostgreSQL Datums into Python dict objects in a scratch memory context
>> that gets reset every time.
>
> As best I can tell, this patch proposes creating a new, separate context
> (chewing up 8KB+) for every plpython procedure that's ever used in a
> given session. This cure could easily be worse than the disease as far

Yeah, that's not ideal.

> What's more, it's unclear that
> it won't malfunction altogether if the function is used recursively
> (ie, what if PLyDict_FromTuple ends up calling the same function again?)

I was a bit worried about that, but the only place where
PLyDict_FromTuple calls into some other code is when it calls the type's
specific I/O function, which AFAICT can't call back into user code
(except for user-defined C I/O routines). It's not very comfortable, but
I think PLyDict_FromTuple can be allowed to be non-reentrant.

> Can't you fix it so that the temp context is associated with a
> particular function execution, rather than being "statically" allocated
> per-function?

That would be cool, but I failed to easily get a handle on something
that's like the execution context of a PL/Python function... Actually,
if we assume that PLyDict_FromTuple (which is quite a low-level thing)
never calls PL/Python UDFs we could keep a single memory context in
top-level PL/Python memory and pay the overhead once in a session, not
once per function.

OTOH if we want to make it reentrant, some more tinkering would be in order.

Cheers,
Jan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python long-lived allocations in datum->dict transformation
Date: 2012-02-14 00:35:33
Message-ID: 24084.1329179733@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Urbański <wulczer(at)wulczer(dot)org> writes:
> On 12/02/12 00:48, Tom Lane wrote:
>> What's more, it's unclear that
>> it won't malfunction altogether if the function is used recursively
>> (ie, what if PLyDict_FromTuple ends up calling the same function again?)

> I was a bit worried about that, but the only place where
> PLyDict_FromTuple calls into some other code is when it calls the type's
> specific I/O function, which AFAICT can't call back into user code
> (except for user-defined C I/O routines). It's not very comfortable, but
> I think PLyDict_FromTuple can be allowed to be non-reentrant.

I think that's pretty short-sighted. Even if it's safe today (which
I am not 100% convinced of), there are plenty of foreseeable reasons
why it might^Wwill break in the future.

* There is no reason to think that datatype I/O functions will never
be written in anything but C. People have asked repeatedly for the
ability to write them in higher-level languages. I doubt that would
ever be possible in plpgsql, but with languages that can do
bit-twiddling like plpython or plperl, it seems possible.

* A datatype I/O function, even if written in C, could call user-written
code. See domain_in for example, which can invoke arbitrary processing
via domain constraint checking. If you were proposing to patch
PLyObject_ToTuple rather than the other direction, this patch would be
breakable today. Admittedly the breakage would require some rather
contrived coding ("your domain's constraint check function does
*what*?"), but it would still be a security bug.

* Once we have the ability to associate a temp memory context with
plpython, there will be a temptation to use it for other purposes
besides this one, and it will not be long before such a purpose does
open a recursion risk, even if there's none there today. (Speaking of
which, it sure looks to me like PLyObject_ToDatum, PLyObject_ToTuple,
etc leak memory like there's no tomorrow.)

> OTOH if we want to make it reentrant, some more tinkering would be in order.

I think that's in order.

regards, tom lane


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python long-lived allocations in datum->dict transformation
Date: 2012-02-20 00:09:21
Message-ID: 4F418F31.1070204@wulczer.org
Lists: pgsql-hackers

On 14/02/12 01:35, Tom Lane wrote:
> Jan Urbański <wulczer(at)wulczer(dot)org> writes:
>> It's not very comfortable, but
>> I think PLyDict_FromTuple can be allowed to be non-reentrant.
>
> I think that's pretty short-sighted. Even if it's safe today (which
> I am not 100% convinced of), there are plenty of foreseeable reasons
> why it might^Wwill break in the future.
>
>> OTOH if we want to make it reentrant, some more tinkering would be in order.
>
> I think that's in order.

Here are the results of the tinkering.

I came up with a stack of context structures that gets pushed when a
PL/Python function starts being executed and popped when it returns. At first
they contained just a scratch memory context used by PLyDict_FromTuple.
Then under the premise of confirming the usefulness of introducing such
contexts I removed the global PLy_curr_procedure variable and changed
all users to get the current procedure from the context. It seems to
have worked, so the total count of global variables is unchanged - hooray!
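
To sketch the shape of it (the names below are illustrative; see the
attached patch for the real thing):

#include "postgres.h"
#include "utils/memutils.h"

typedef struct PLyExecutionContext
{
    PLyProcedure *curr_proc;    /* the procedure being executed */
    MemoryContext scratch_ctx;  /* scratch space, e.g. for PLyDict_FromTuple */
    struct PLyExecutionContext *next;   /* the frame below us on the stack */
} PLyExecutionContext;

/* Top of the stack: the innermost PL/Python call currently executing. */
static PLyExecutionContext *PLy_execution_contexts = NULL;

static PLyExecutionContext *
PLy_push_execution_context(void)
{
    PLyExecutionContext *context;

    context = MemoryContextAlloc(TopMemoryContext,
                                 sizeof(PLyExecutionContext));
    context->curr_proc = NULL;
    context->scratch_ctx = AllocSetContextCreate(TopMemoryContext,
                                                 "PL/Python scratch context",
                                                 ALLOCSET_DEFAULT_MINSIZE,
                                                 ALLOCSET_DEFAULT_INITSIZE,
                                                 ALLOCSET_DEFAULT_MAXSIZE);
    context->next = PLy_execution_contexts;
    PLy_execution_contexts = context;
    return context;
}

static void
PLy_pop_execution_context(void)
{
    PLyExecutionContext *context = PLy_execution_contexts;

    PLy_execution_contexts = context->next;
    MemoryContextDelete(context->scratch_ctx);
    pfree(context);
}

PLyDict_FromTuple can then do its work in PLy_execution_contexts->scratch_ctx
and reset it after each row, so a recursive call simply sees its own frame's
scratch context.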

While testing I found one more leak, this time caused by allocating a
structure for caching array type I/O functions and never freeing it.
Attached as separate patch.

Cheers,
Jan

Attachment Content-Type Size
plpython-execution-contexts.patch text/x-diff 17.6 KB
plpython-array-leak.patch text/x-diff 574 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/python long-lived allocations in datum->dict transformation
Date: 2012-03-13 17:32:20
Message-ID: 12683.1331659940@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Urbański <wulczer(at)wulczer(dot)org> writes:
> I came up with a stack of context structures that gets pushed when a
> PL/Python function starts being executed and popped when it returns. At first
> they contained just a scratch memory context used by PLyDict_FromTuple.
> Then under the premise of confirming the usefulness of introducing such
> contexts I removed the global PLy_curr_procedure variable and changed
> all users to get the current procedure from the context. It seems to
> have worked, so the total count of global variables is unchanged - hooray!

Applied with some adjustments --- mainly, I thought you were being
too incautious about ensuring that the stack got popped once it'd been
pushed. The easiest way to fix that was to do the pushes after the
SPI_connect calls, which required decoupling the behavior from
CurrentMemoryContext, which seemed like a good idea anyway.
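
In outline, that looks something like this (a sketch of the pattern only;
the committed code differs in its details):

#include "postgres.h"
#include "fmgr.h"
#include "executor/spi.h"

Datum
plpython_call_handler(PG_FUNCTION_ARGS)
{
    Datum       retval = (Datum) 0;

    if (SPI_connect() != SPI_OK_CONNECT)
        elog(ERROR, "SPI_connect failed");

    /* Push only once SPI_connect has succeeded ... */
    PLy_push_execution_context();

    PG_TRY();
    {
        /* look up / compile the procedure and run it, setting retval */
    }
    PG_CATCH();
    {
        /* ... so that every exit path, error or not, pops it again. */
        PLy_pop_execution_context();
        PG_RE_THROW();
    }
    PG_END_TRY();

    PLy_pop_execution_context();

    if (SPI_finish() != SPI_OK_FINISH)
        elog(ERROR, "SPI_finish failed");

    return retval;
}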

> While testing I found one more leak, this time caused by allocating a
> structure for caching array type I/O functions and never freeing it.
> Attached as separate patch.

Applied also, but surely if we're leaking memory from the input
descriptors then we should worry about the output ones too?
I made it do that, but if that's wrong, somebody explain why.

regards, tom lane