plpgsql leaking memory when stringifying datums

Lists: pgsql-hackers
From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: plpgsql leaking memory when stringifying datums
Date: 2012-02-05 19:07:22
Message-ID: 4F2ED36A.9020907@wulczer.org

While chasing a PL/Python memory leak, I did a few tests with PL/pgSQL
and I think there are places where memory is not freed sufficiently early.

Attached are two functions that, when run, make the backend's memory
consumption grow until they finish. In both cases the cause is
convert_value_to_string calling a datum's output function, which for
toasted data results in an allocation.

The proposed patch changes convert_value_to_string to call the output
function in the per-tuple memory context and then copy the result string
back to the original context.
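
A minimal sketch of that copy-out idea, using a toy bump allocator in place of PostgreSQL's memory contexts. Everything here (Arena, arena_alloc, arena_reset, toy_output, convert_copy_out) is a hypothetical stand-in and not code from the patch; malloc/free play the role of pstrdup/pfree in the caller's context.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a short-lived memory context: a bump allocator
 * whose contents are all released at once by arena_reset(). */
typedef struct Arena {
    char   buf[4096];
    size_t used;
} Arena;

static void *arena_alloc(Arena *a, size_t n)
{
    void *p;

    n = (n + 7) & ~(size_t) 7;              /* keep allocations 8-byte aligned */
    assert(a->used + n <= sizeof(a->buf));  /* toy arena: fixed capacity */
    p = a->buf + a->used;
    a->used += n;
    return p;
}

static void arena_reset(Arena *a)
{
    a->used = 0;
}

/* Hypothetical output function: besides its result it allocates scratch
 * space (think of a detoasted copy) that it never frees. */
static char *toy_output(Arena *cur, const char *datum)
{
    size_t len = strlen(datum) + 1;
    char  *scratch = (char *) arena_alloc(cur, len);
    char  *result = (char *) arena_alloc(cur, len);

    memcpy(scratch, datum, len);
    memcpy(result, scratch, len);
    return result;
}

/* Run the output function in the short-lived arena, then copy only the
 * result out; malloc/free stand in for pstrdup/pfree. */
static char *convert_copy_out(Arena *per_tuple, const char *datum)
{
    char  *tmp = toy_output(per_tuple, datum);
    size_t len = strlen(tmp) + 1;
    char  *copy = (char *) malloc(len);

    assert(copy != NULL);
    memcpy(copy, tmp, len);
    return copy;
}
```

The caller frees the returned copy itself, and a later arena_reset() on the per-tuple arena discards both the scratch space and the temporary result in one step.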

The comment in that function says that callers generally pfree its
result, but that wasn't the case with exec_stmt_raise, so I added a
pfree() there as well.

With that I was still left with a leak in the typecast() test function
and it turns out that sticking an exec_eval_cleanup into exec_move_row
fixed it. The regression tests pass, but I'm not 100% sure if it's
actually safe.

Since convert_value_to_string needed access to PL/pgSQL's execution
state to get its hands on the per-tuple context, I had to pass that
state to every caller that didn't have it already, which means
exec_cast_value and exec_simple_cast_value. Does anyone have a better idea?

The initial diagnosis and proposed solution are by Andres Freund - thanks!

Cheers,
Jan

Attachment                        Content-Type       Size
plpgsql-convert-value-leak.patch  text/x-diff        13.7 KB
plpgsql-leaks.sql                 application/x-sql  529 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql leaking memory when stringifying datums
Date: 2012-02-11 02:05:04
Message-ID: 29668.1328925904@sss.pgh.pa.us

Jan Urbański <wulczer(at)wulczer(dot)org> writes:
> While chasing a PL/Python memory leak, I did a few tests with PL/pgSQL
> and I think there are places where memory is not freed sufficiently early.

I think the basic issue here is that the type output function might
generate (and not bother to free) additional cruft besides its output
string, so that pfree'ing the output alone is not sufficient to avoid
a memory leak if the call occurs in a long-lived context.

However, I don't much care for the details of the proposed patch: if
we're going to fix this by running the output function in the per-tuple
memory context, and expecting the caller to do exec_eval_cleanup later,
why should we add extra pstrdup/pfree overhead? We can just leave the
result in the temp context in most cases, and thus get a net savings
rather than a net cost from fixing this. The attached modified patch
does it like that.
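
A rough sketch of why leaving the result in the temporary context works, again with a toy bump allocator rather than PostgreSQL's real memory contexts. All names (Arena, arena_alloc, arena_reset, toy_output, peak_usage) are hypothetical and chosen only for illustration; the reset at the end of each iteration plays the role of the caller's exec_eval_cleanup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the per-tuple memory context. */
typedef struct Arena {
    char   buf[4096];
    size_t used;
} Arena;

static void *arena_alloc(Arena *a, size_t n)
{
    void *p;

    n = (n + 7) & ~(size_t) 7;              /* keep allocations 8-byte aligned */
    assert(a->used + n <= sizeof(a->buf));  /* toy arena: fixed capacity */
    p = a->buf + a->used;
    a->used += n;
    return p;
}

static void arena_reset(Arena *a)
{
    a->used = 0;
}

/* Hypothetical output function that leaves scratch space behind. */
static char *toy_output(Arena *cur, const char *datum)
{
    size_t len = strlen(datum) + 1;
    char  *scratch = (char *) arena_alloc(cur, len);
    char  *result = (char *) arena_alloc(cur, len);

    memcpy(scratch, datum, len);
    memcpy(result, scratch, len);
    return result;
}

/* Convert a value on every iteration, use the result in place (no copy,
 * no explicit free), then reset the arena. Returns the peak arena usage. */
static size_t peak_usage(Arena *per_tuple, const char *datum, int iterations)
{
    size_t peak = 0;
    size_t total_len = 0;
    int    i;

    for (i = 0; i < iterations; i++)
    {
        const char *s = toy_output(per_tuple, datum);

        total_len += strlen(s);             /* "use" the string in place */
        if (per_tuple->used > peak)
            peak = per_tuple->used;
        arena_reset(per_tuple);             /* result and cruft vanish together */
    }
    (void) total_len;
    return peak;
}
```

In this toy model the peak usage does not depend on the iteration count, and no per-result copy or free is needed, which is the net savings described above.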

BTW, it occurs to me to wonder whether we need to worry about such
subsidiary leaks in type input functions as well. I see at least one
place where pl_exec.c is tediously freeing the result of
exec_simple_cast_value, but if there are secondary leaks that's not
going to be good enough. Maybe we should switch over to a similar
definition where the cast result is in the per-tuple context, and you've
got to copy it if you want it to be long-lived.

regards, tom lane

Attachment                          Content-Type  Size
plpgsql-convert-value-leak-2.patch  text/x-patch  14.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql leaking memory when stringifying datums
Date: 2012-02-11 20:10:01
Message-ID: 17454.1328991001@sss.pgh.pa.us

I wrote:
> BTW, it occurs to me to wonder whether we need to worry about such
> subsidiary leaks in type input functions as well.

Sure enough, once you find an input function that leaks memory, there's
trouble:

create type myrow as (f1 text, f2 text, f3 text);

create or replace function leak_assign() returns void as $$
declare
    t myrow[];
    i int;
begin
    for i in 1..10000000 loop
        t := '{"(abcd,efg' || ',hij)", "(a,b,c)"}';
    end loop;
end;
$$ language plpgsql;

So the attached third try also moves the input function calls in
exec_cast_value into the short-lived context, and rejiggers callers as
necessary to deal with that. This actually ends up simpler and probably
faster than the original coding, because we are able to get rid of some
ad-hoc data copying and pfree'ing, and most of the performance-critical
code paths already had exec_eval_cleanup calls anyway.

Also, you wrote:
>> With that I was still left with a leak in the typecast() test function
>> and it turns out that sticking an exec_eval_cleanup into exec_move_row
>> fixed it. The regression tests pass, but I'm not 100% sure if it's
>> actually safe.

After some study I felt pretty nervous about that too. It's safe enough
with the statement-level callers of exec_move_row, but there are several
calls from exec_assign_value, whose API contract says specifically that
it *won't* call exec_eval_cleanup. Even if it works today, that's a bug
waiting to happen. So I took the exec_eval_cleanup back out of
exec_move_row, and instead made all the statement-level callers do it.
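
The hazard can be illustrated with a toy model (not PostgreSQL code; Arena, eval_expr, move_row, move_row_with_cleanup and both callers are hypothetical names). The reset here deliberately overwrites released bytes so that a stale pointer is easy to detect; it is an assumption of the sketch, not a claim about how the real cleanup behaves.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for the short-lived evaluation context. */
typedef struct Arena {
    char   buf[4096];
    size_t used;
} Arena;

static void *arena_alloc(Arena *a, size_t n)
{
    void *p;

    n = (n + 7) & ~(size_t) 7;              /* keep allocations 8-byte aligned */
    assert(a->used + n <= sizeof(a->buf));  /* toy arena: fixed capacity */
    p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Reset that overwrites released bytes, so stale pointers show up. */
static void arena_reset(Arena *a)
{
    memset(a->buf, 0x7F, a->used);
    a->used = 0;
}

/* Hypothetical expression evaluation: the value lives in the temp arena. */
static char *eval_expr(Arena *tmp, const char *text)
{
    size_t len = strlen(text) + 1;
    char  *v = (char *) arena_alloc(tmp, len);

    memcpy(v, text, len);
    return v;
}

/* Low-level helper that only copies; cleanup is left to the caller. */
static void move_row(char *target, size_t n, const char *value)
{
    snprintf(target, n, "%s", value);
}

/* Variant that cleans up the temp arena itself (the risky design). */
static void move_row_with_cleanup(Arena *tmp, char *target, size_t n,
                                  const char *value)
{
    move_row(target, n, value);
    arena_reset(tmp);
}

/* A nested caller still holds its own temp value across the helper call.
 * Returns 1 if that value survived, 0 if it was clobbered. */
static int nested_caller_value_survives(Arena *tmp)
{
    char  target[16];
    char *mine = eval_expr(tmp, "keep");
    char *val = eval_expr(tmp, "abc");

    move_row_with_cleanup(tmp, target, sizeof(target), val);
    return memcmp(mine, "keep", 5) == 0;
}

/* A statement-level caller resets only once nothing points into the arena.
 * Returns 1 if the assignment produced the expected target value. */
static int statement_level_assign(Arena *tmp)
{
    char  target[16];
    char *val = eval_expr(tmp, "abc");

    move_row(target, sizeof(target), val);
    arena_reset(tmp);
    return strcmp(target, "abc") == 0;
}
```

In the toy, the nested caller loses its value as soon as the helper resets the arena, while the statement-level caller is fine because it resets only after it has finished with everything allocated there.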

I think this version is ready to go, so barring objections I'll set to
work on back-patching it.

regards, tom lane

Attachment                         Content-Type  Size
plpgsql-io-function-leaks-3.patch  text/x-patch  22.1 KB