Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table

From: "MauMau" <maumau307(at)gmail(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2013-06-18 11:27:43
Message-ID: 214653D8DF574BFEAA6ED53E545E99E4@maumau
Lists: pgsql-hackers

Hello,

I've encountered a memory leak when I use a PL/pgSQL function which
creates and drops a temporary table. I couldn't find any similar problem in
the mailing list archives. I'd like to ask you whether this is a PostgreSQL
bug. Maybe I should post this to pgsql-bugs or pgsql-general, but the
discussion is likely to involve the internal behavior of PostgreSQL, so let
me post it here.

The steps to reproduce the problem are as follows. Please find attached the
two files used in these steps.
$ psql -d postgres -f myfunc.sql
$ ecpg myfunc.pgc
$ cc -I<pgsql_inst_dir>/include myfunc.c -o myfunc -L<pgsql_inst_dir>/lib -lecpg
$ ./myfunc

The longer the program myfunc runs, the larger VSZ and RSS grow; they keep
growing even after 50,000 transactions.

The cause of the memory increase appears to be CacheMemoryContext. When I
attached to postgres with gdb and ran "call
MemoryContextStats(TopMemoryContext)" several times, the size of
CacheMemoryContext kept increasing.

By the way, when I replace "SELECT COUNT(*) INTO cnt FROM mytable" in the
PL/pgSQL function with "INSERT INTO mytable VALUES(1)", the memory stops
increasing. So, the memory leak seems to occur when SELECT is used.
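
For reference, a rough sketch of the kind of function involved (reconstructed
from the description above rather than copied from the attached myfunc.sql;
the column definition is only illustrative):

CREATE OR REPLACE FUNCTION myfunc() RETURNS integer AS $$
DECLARE
    cnt integer;
BEGIN
    CREATE TEMPORARY TABLE mytable (id integer);
    -- (replacing this SELECT with an INSERT stops the memory growth)
    SELECT COUNT(*) INTO cnt FROM mytable;
    DROP TABLE mytable;
    RETURN cnt;
END;
$$ LANGUAGE plpgsql;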

I know a solution -- adding "IF NOT EXISTS" to the CREATE TEMPORARY TABLE
prevents the memory increase. But why? What's wrong with my program? I'd
like to know:

Q1: Is this a bug in PostgreSQL?

Q2: If yes, is a fix planned for the upcoming minor release?

Q3: If this is not a bug but reasonable behavior, is it documented
anywhere?

Regards
MauMau

Attachment Content-Type Size
myfunc.pgc application/octet-stream 795 bytes
myfunc.sql application/octet-stream 233 bytes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2013-06-18 12:48:44
Message-ID: 51C0572C.5080708@vmware.com
Lists: pgsql-hackers

On 18.06.2013 14:27, MauMau wrote:
> The cause of the memory increase appears to be CacheMemoryContext. When
> I attached to postgres with gdb and ran "call
> MemoryContextStats(TopMemoryContext)" several times, the size of
> CacheMemoryContext kept increasing.

Hmm. I could repeat this, and it seems that the catcache for
pg_statistic accumulates negative cache entries. Those slowly take up
the memory.

Seems that we should somehow flush those, when the table is dropped. Not
sure how, but I'll take a look.

- Heikki


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2013-06-18 18:07:59
Message-ID: 51C0A1FF.2050404@vmware.com
Lists: pgsql-hackers

On 18.06.2013 15:48, Heikki Linnakangas wrote:
> On 18.06.2013 14:27, MauMau wrote:
>> The cause of the memory increase appears to be CacheMemoryContext. When
>> I attached to postgres with gdb and ran "call
>> MemoryContextStats(TopMemoryContext)" several times, the size of
>> CacheMemoryContext kept increasing.
>
> Hmm. I could repeat this, and it seems that the catcache for
> pg_statistic accumulates negative cache entries. Those slowly take up
> the memory.

Digging a bit deeper, this is a rather common problem with negative
catcache entries. In general, nothing stops you from polluting the cache
with as many negative cache entries as you like. Just do "select * from
table_that_doesnt_exist" for as many non-existent table names as you
want, for example. Those entries are useful at least in theory; they
speed up throwing the error the next time you try to query the same
non-existent table.

But there is a crucial difference in this case: the system created a
negative cache entry for the pg_statistic row of the table, but once the
relation is dropped, that cache entry, keyed on the relation's OID, is
totally useless. It should be removed.

We have this problem with a few other catcaches too, which have what is
effectively a foreign key relationship with another catalog. For
example, the RELNAMENSP catcache is keyed on pg_class.relname,
pg_class.relnamespace, yet any negative entries are not cleaned up when
the schema is dropped. If you execute this repeatedly in a session:

CREATE SCHEMA foo;
SELECT * from foo.invalid; -- throws an error
DROP SCHEMA foo;

it will leak similarly to the original test case, but this time the leak
is into the RELNAMENSP catcache.

To fix that, I think we'll need to teach the catalog cache about the
relationships between the caches.

- Heikki


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2013-06-18 22:40:09
Message-ID: 9CABD6636F244DD7B0EFBE0336EA4E2E@maumau
Lists: pgsql-hackers

From: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>
> On 18.06.2013 15:48, Heikki Linnakangas wrote:
>> Hmm. I could repeat this, and it seems that the catcache for
>> pg_statistic accumulates negative cache entries. Those slowly take up
>> the memory.
>
> Digging a bit deeper, this is a rather common problem with negative
> catcache entries. In general, nothing stops you from polluting the cache
> with as many negative cache entries as you like. Just do "select * from
> table_that_doesnt_exist" for as many non-existent table names as you want,
> for example. Those entries are useful at least in theory; they speed up
> throwing the error the next time you try to query the same non-existent
> table.

Really? Would the catcache be polluted with entries for nonexistent tables?
I'm surprised by this. I don't think it is necessary to speed up queries
that fail because of nonexistent tables, since such queries should be
eliminated during application development.

> But there is a crucial difference in this case; the system created a
> negative cache entry for the pg_statistic row of the table, but once the
> relation is dropped, the cache entry keyed on the relation's OID, is
> totally useless. It should be removed.
>
> We have this problem with a few other catcaches too, which have what is
> effectively a foreign key relationship with another catalog. For example,
> the RELNAMENSP catcache is keyed on pg_class.relname,
> pg_class.relnamespace, yet any negative entries are not cleaned up when
> the schema is dropped. If you execute this repeatedly in a session:
>
> CREATE SCHEMA foo;
> SELECT * from foo.invalid; -- throws an error
> DROP SCHEMA foo;
>
> it will leak similarly to the original test case, but this time the leak
> is into the RELNAMENSP catcache.
>
> To fix that, I think we'll need to teach the catalog cache about the
> relationships between the caches.

Thanks for your concise explanation. Do you think it will be difficult to
fix that bug? It sounds difficult to me... though I don't know the design of
the catcaches yet.

Could you tell me the conditions under which this bug occurs and how to
avoid it? I thought of the following:

[Condition]
1. Create and drop the same table repeatedly in the same session. Whether
the table is a temporary table is irrelevant.
2. Do SELECT against the table. INSERT/DELETE/UPDATE won't cause the
catcache leak.
3. Whether the processing happens in a PL/pgSQL function is irrelevant. The
leak occurs even when you do not use PL/pgSQL (see the plain-SQL sketch
after this list).
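
For illustration, a minimal plain-SQL sketch of the pattern in conditions 1-3
(a reconstruction, not taken from the attachments; the column definition is
arbitrary). Repeating these three statements in one psql session should make
CacheMemoryContext grow in the same way:

CREATE TEMPORARY TABLE mytable (id integer);
SELECT COUNT(*) FROM mytable;   -- planning this leaves a negative pg_statistic entry
DROP TABLE mytable;             -- the entry, keyed on the dropped OID, is never removed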

[Workaround]
Use CREATE TABLE IF NOT EXISTS and TRUNCATE (or ON COMMIT DROP in the case
of temporary tables) to avoid repeatedly creating and dropping the same
table.
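
A rough sketch of that workaround applied to a function like the one above
(again a reconstruction; the table, column, and function names are only
illustrative):

CREATE OR REPLACE FUNCTION myfunc() RETURNS integer AS $$
DECLARE
    cnt integer;
BEGIN
    -- reuse one temporary table instead of dropping and recreating it
    CREATE TEMPORARY TABLE IF NOT EXISTS mytable (id integer);
    TRUNCATE mytable;
    SELECT COUNT(*) INTO cnt FROM mytable;
    RETURN cnt;
END;
$$ LANGUAGE plpgsql;

The table then keeps the same OID across calls, so stale negative catcache
entries should no longer accumulate.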

Regards
MauMau


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2013-06-18 23:32:01
Message-ID: CAMkU=1wRqcs2xLs4=iDU=nXtJKr5kiiyv-Ev-Uvwd=u=gYCa_g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 18, 2013 at 3:40 PM, MauMau <maumau307(at)gmail(dot)com> wrote:

> From: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>
>
>> On 18.06.2013 15:48, Heikki Linnakangas wrote:
>>
>>> Hmm. I could repeat this, and it seems that the catcache for
>>> pg_statistic accumulates negative cache entries. Those slowly take up
>>> the memory.
>>>
>>
>> Digging a bit deeper, this is a rather common problem with negative
>> catcache entries. In general, nothing stops you from polluting the cache
>> with as many negative cache entries as you like. Just do "select * from
>> table_that_doesnt_exist" for as many non-existent table names as you want,
>> for example. Those entries are useful at least in theory; they speed up
>> throwing the error the next time you try to query the same non-existent
>> table.
>>
>
> Really? Would the catcache be polluted with entries for nonexistent
> tables? I'm surprised at this. I don't think it is necessary to speed up
> the query that fails with nonexistent tables, because such queries should
> be eliminated during application development.

I was thinking the same thing: optimizing for failure is nice if there are
no tradeoffs, but not so nice if it leaks memory. But apparently the
negative cache was added for real reasons, not just in theory. See the
discussion from when it was added:

http://www.postgresql.org/message-id/19585.1012350724@sss.pgh.pa.us

Cheers,

Jeff


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2013-06-19 12:47:01
Message-ID: FC539243082B4032858D618207185693@maumau
Lists: pgsql-hackers

From: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
> On Tue, Jun 18, 2013 at 3:40 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
>> Really? Would the catcache be polluted with entries for nonexistent
>> tables? I'm surprised at this. I don't think it is necessary to speed up
>> the query that fails with nonexistent tables, because such queries should
>> be eliminated during application development.
>
>
> I was thinking the same thing, optimizing for failure is nice if there are
> no tradeoffs, but not so nice if it leaks memory. But apparently the
> negative cache was added for real reasons, not just theory. See
> discussion
> from when it was added:
>
> http://www.postgresql.org/message-id/19585.1012350724@sss.pgh.pa.us

Thanks for the info. I (probably) understood why negative catcache entries
are necessary.

> Hmm. I could repeat this, and it seems that the catcache for
> pg_statistic accumulates negative cache entries. Those slowly take up
> the memory.
>
> Seems that we should somehow flush those, when the table is dropped. Not
> sure how, but I'll take a look.

As Heikki-san said above, there should be something wrong somewhere,
shouldn't there? In my testing, just repeating CREATE (TEMPORARY) TABLE, a
SELECT against it, and DROP TABLE led to more than 400MB in
CacheMemoryContext, at which point I stopped the test. It seems that the
catcache grows without bound simply by repeating simple transactions.

I would like to know the conditions under which this happens so that I can
apply workarounds in my application to avoid the problem. Your cooperation
would be much appreciated.

Regards
MauMau


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: MauMau <maumau307(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2014-01-25 21:36:56
Message-ID: 20140125213656.GG9750@momjian.us
Lists: pgsql-hackers

On Tue, Jun 18, 2013 at 09:07:59PM +0300, Heikki Linnakangas wrote:
> >Hmm. I could repeat this, and it seems that the catcache for
> >pg_statistic accumulates negative cache entries. Those slowly take up
> >the memory.
>
> Digging a bit deeper, this is a rather common problem with negative
> catcache entries. In general, nothing stops you from polluting the
> cache with as many negative cache entries as you like. Just do
> "select * from table_that_doesnt_exist" for as many non-existent
> table names as you want, for example. Those entries are useful at
> least in theory; they speed up throwing the error the next time you
> try to query the same non-existent table.
>
> But there is a crucial difference in this case; the system created a
> negative cache entry for the pg_statistic row of the table, but once
> the relation is dropped, the cache entry keyed on the relation's
> OID, is totally useless. It should be removed.
>
> We have this problem with a few other catcaches too, which have what
> is effectively a foreign key relationship with another catalog. For
> example, the RELNAMENSP catcache is keyed on pg_class.relname,
> pg_class.relnamespace, yet any negative entries are not cleaned up
> when the schema is dropped. If you execute this repeatedly in a
> session:
>
> CREATE SCHEMA foo;
> SELECT * from foo.invalid; -- throws an error
> DROP SCHEMA foo;
>
> it will leak similarly to the original test case, but this time the
> leak is into the RELNAMENSP catcache.
>
> To fix that, I think we'll need to teach the catalog cache about the
> relationships between the caches.

Is this a TODO?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: MauMau <maumau307(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory leak in PL/pgSQL function which CREATE/SELECT/DROP a temporary table
Date: 2014-01-27 07:43:51
Message-ID: 52E60E37.1010709@vmware.com
Lists: pgsql-hackers

On 01/25/2014 11:36 PM, Bruce Momjian wrote:
> On Tue, Jun 18, 2013 at 09:07:59PM +0300, Heikki Linnakangas wrote:
>>> Hmm. I could repeat this, and it seems that the catcache for
>>> pg_statistic accumulates negative cache entries. Those slowly take up
>>> the memory.
>>
>> Digging a bit deeper, this is a rather common problem with negative
>> catcache entries. In general, nothing stops you from polluting the
>> cache with as many negative cache entries as you like. Just do
>> "select * from table_that_doesnt_exist" for as many non-existent
>> table names as you want, for example. Those entries are useful at
>> least in theory; they speed up throwing the error the next time you
>> try to query the same non-existent table.
>>
>> But there is a crucial difference in this case; the system created a
>> negative cache entry for the pg_statistic row of the table, but once
>> the relation is dropped, the cache entry keyed on the relation's
>> OID, is totally useless. It should be removed.
>>
>> We have this problem with a few other catcaches too, which have what
>> is effectively a foreign key relationship with another catalog. For
>> example, the RELNAMENSP catcache is keyed on pg_class.relname,
>> pg_class.relnamespace, yet any negative entries are not cleaned up
>> when the schema is dropped. If you execute this repeatedly in a
>> session:
>>
>> CREATE SCHEMA foo;
>> SELECT * from foo.invalid; -- throws an error
>> DROP SCHEMA foo;
>>
>> it will leak similarly to the original test case, but this time the
>> leak is into the RELNAMENSP catcache.
>>
>> To fix that, I think we'll need to teach the catalog cache about the
>> relationships between the caches.
>
> Is this a TODO?

Yes, I think so. Added.

- Heikki