Re: transaction control in pl/pgsql

From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Birgit Laggner <birgit(dot)laggner(at)vti(dot)bund(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: transaction control in pl/pgsql
Date: 2010-04-01 15:06:12
Message-ID: 100D1105-ED49-4DCD-A705-A98D7565D329@solfertje.student.utwente.nl
Lists: pgsql-general

On 1 Apr 2010, at 14:38, Birgit Laggner wrote:

> Hi Alban,
>
> thanks for your detailed answer!
>
> My database settings:
> max connections: 20
> work_mem: 100MB
> shared buffers: 12000MB
>
> Server memory:
> physical 32GB
> total memory 50GB (incl. swap)
> shared memory ??

OK, so the maximum work_mem in use at any given time is 20 * 100MB, roughly 2GB. Considering you have 12GB of shared buffers and 50GB available in total, that's not your problem.

Btw, the amount of shared memory is controlled by the kernel (you appear to be on a Linux system). If the kernel's shared memory limit were smaller than your 12GB of shared buffers, Postgres wouldn't have started up at all, so there's probably no need to look at that value.
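
If you want to double-check those numbers against what the server is actually running with, the standard SHOW commands (or the pg_settings view) will tell you; nothing below is specific to your setup:

SHOW max_connections;   -- 20 in your case
SHOW work_mem;          -- allocated per sort/hash operation in a connection
SHOW shared_buffers;    -- must fit within the kernel's shared memory limit

-- Or all at once, with units:
SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('max_connections', 'work_mem', 'shared_buffers');

-- On Linux the kernel limit itself is kernel.shmmax (sysctl kernel.shmmax).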

> I am not sure if I use deferred constraints - the only constraints I use
> in the function are NOT NULL constraints. But I don't believe my memory
> overflow had something to do with them, because the error occurred during
> a loop. There shouldn't be any sudden changes regarding any NOT
> NULL constraints between one loop cycle and the next.

Deferred constraints are constraints that aren't checked until the end of the transaction. You can imagine that keeping track of everything that still has to be checked at commit can build up. But this doesn't apply to NOT NULL constraints, as the documentation says:

DEFERRABLE
NOT DEFERRABLE
This controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. Checking of constraints that are deferrable may be postponed until the end of the transaction (using the SET CONSTRAINTS command). NOT DEFERRABLE is the default. Only foreign key constraints currently accept this clause. All other constraint types are not deferrable.
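
Just to illustrate the syntax (the tables below are made up, not from your schema):

CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES parent (id)
              DEFERRABLE INITIALLY DEFERRED   -- checked at COMMIT, not per statement
);

BEGIN;
-- With the check deferred this ordering is allowed; the foreign key
-- is only verified when the transaction commits.
INSERT INTO child  VALUES (1, 1);
INSERT INTO parent VALUES (1);
COMMIT;

-- Deferral can also be requested explicitly per transaction:
-- SET CONSTRAINTS ALL DEFERRED;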

> My function was the only connection to the database as far as I know -
> so, I don't think allocation of memory should be a reason...

Probably not, no.

> I would say: Yes, my function seems to store large amounts of data in
> memory. But in my function, I tried to store as much as possible of the
> interim results in real tables (not temp tables) instead of storing them
> in variables.

Good practice, I think. Do any of those variables ever contain a large amount of data? I realise that since you have one single function, many of your variables stay allocated until the final END statement. If that's what's causing you to run out of memory, it would help to split the function up into smaller ones; that reduces the scope of such variables.
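
Roughly what I mean (the function and table names below are invented; the real steps would be your own):

-- One step factored out into its own function; its local variables are
-- freed as soon as it returns.
CREATE TABLE interim_results (id integer, payload text);

CREATE OR REPLACE FUNCTION process_step_one() RETURNS void AS $$
DECLARE
    big_buffer text;   -- only lives until this function returns
BEGIN
    big_buffer := 'some large intermediate value';
    INSERT INTO interim_results VALUES (1, big_buffer);
END;
$$ LANGUAGE plpgsql;

-- A thin wrapper, so you still have only one call to make at the end.
CREATE OR REPLACE FUNCTION process_all() RETURNS void AS $$
BEGIN
    PERFORM process_step_one();
    -- PERFORM process_step_two();  ...and so on for the other steps
END;
$$ LANGUAGE plpgsql;

SELECT process_all();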

> But my guess is that postgres doesn't write the tables and
> therefore keeps everything in memory. (by the way: the swap was used up,
> too)

No, it certainly doesn't do that. If you're operating on tables (as you say you do), then it writes those changes to its WAL.

Maybe there's something else trying to keep lots of data in memory? Can you tell what is using all that memory?

What happens to the results of your function? Is it a large result set, and is some piece of software on the same machine buffering all of it in memory, for example to display it? Web-site scripts have a tendency to handle data that way, and I recall psql can suffer from the same problem, although it has a switch to turn that off.
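
If I remember correctly that switch is psql's FETCH_COUNT variable, used roughly like this (the query below is just a placeholder):

-- In psql: fetch results in batches through a cursor instead of buffering
-- the whole result set in client memory.
\set FETCH_COUNT 1000
SELECT * FROM my_big_result;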

A few wild guesses: did someone, for example, configure the WAL to be on a RAM disk? Is some other process churning through memory while you're running your function?

> It's really difficult to post only a part of the function, just because
> every next step is based on the result of the previous step. I also
> guess that every step on its own wouldn't cause memory overflow, but
> that it's more like the memory use adds up with every step. But I will
> try and cut the function into little snippets and let them run one by
> one - perhaps the memory overflow still occurs for one snippet...
>
> If you have any ideas ...
>
> Thanks again and regards,
>
> Birgit.
>
>
>
>
> On 01.04.2010 13:27, Alban Hertroys wrote:
>> On 1 Apr 2010, at 12:22, Birgit Laggner wrote:
>>
>>
>>> Dear list,
>>>
>>> I have some data (big size) and I've written a long function in pl/pgsql
>>> which processes the data in several steps. At a test run my function
>>> aborted because of memory exhaustion. My guess is that everything that
>>> happens during the function transaction is stored in memory - until
>>> it's full...
>>>
>> No, transactions operate on the database just like anything else. Running out of memory doing that is not impossible, but you'd need to create some really big transactions to reach those limits (I think I've been quoted the limit being 2^32 instructions per transaction a looong time ago, just to give you an indication).
>>
>> You're probably running out of memory for another reason. Now to figure out why...
>>
>> I've seen an issue with deferred constraints causing this, for example. Do you use deferred constraints?
>>
>> Another cause that pops up regularly is that people specify too much global work_mem for postgres. work_mem gets allocated per connection, so the more connections you have, the more memory goes to work_mem; other resources may then receive too little, or you exceed what's available.
>> What are your relevant postgres settings (max connections, work_mem, shared mem, etc.) and how much physical, total and shared memory does your server have?
>>
>> Another possibility is that your function stores large amounts of data in variables that are not backed up by database tables. That means all that data will be stored in memory, and even if it goes to swap at some point (not sure it will, it would seriously hurt performance) there is a limit.
>> If this is the case, maybe you could use temporary tables to process that data instead of trying to do it all in memory.
>>
>>
>>> So, my idea for solving that problem would be to cut the
>>> big function into smaller functions. But, I don't want to write 30
>>> function calls at the end - I would rather like to have one function
>>> which is calling all these small functions, so I would only have to
>>> write one sql-query at the end.
>>>
>> Splitting up big functions into smaller functions is always a good idea. That's part of general programming paradigms. It won't cut down the size of your transaction though.
>>
>>
>>> What I fear is either, that, if this
>>> function calls the other functions, everything is only one transaction
>>> again and I get memory overflow once more.
>>>
>>
>> If the problem persists, maybe you could post your function somewhere. As it's apparently a rather long function, can you strip it down to something that still causes it to run out of memory but that will be a bit easier for the people on this list to wade through?
>>
>> Alban Hertroys
>>
>> --
>> If you can't see the forest for the trees,
>> cut the trees and you'll see there is no forest.
>>
>

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.

