Re: Cannot allocate memory for output buffer

Lists: pgsql-general
From: Pete Erickson <redlamb(at)redlamb(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Cannot allocate memory for output buffer
Date: 2009-11-27 21:34:51
Message-ID: 20091127163451.q63qzkfns40cccwg@imp.redlamb.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

I am looking for some help regarding an python OperationalError that I
recently received while executing a python script using sqlalchemy and
psycopg2. The python script parses an xml file stored on a networked
drive and enters the information into a pgsql database. Sometimes
these xml files reference a binary file which is also located on the
networked drive. These files are subsequently read in and stored in a
table along with the file's md5. The binary data is stored within a
bytea column. This script worked pretty well until recently when it
came across a binary file about 258MB in size. While reading the file
off the networked drive I received an OperationalError indicating that
it was unable to allocate memory for the output buffer. My initial
guess was that it ran out of memory, but according to the task manager
the machine had close to 2GB free when the error occurred.

I'm not 100% sure that this is a pgsql problem, but all Google
searches show the exact error message within libpq's source code which
is why I'm starting here.

The machine that runs the script is a Windows XP machine running
Python 2.6, PsycoPG2 2.0.10, and Postgresql 8.4. The database is
running pgsql 8.4 on a FreeBSD 8.0 box with approximately 16GB or
memory. I've been searching on Google and the mailing list archives
for the past couple days trying to find an answer with no success.

Any help is greatly appreciated. Thanks in advance.

--
Pete Erickson
redlamb _at_ redlamb _dot_ net


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pete Erickson <redlamb(at)redlamb(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Cannot allocate memory for output buffer
Date: 2009-11-27 22:15:41
Message-ID: 21100.1259360141@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Pete Erickson <redlamb(at)redlamb(dot)net> writes:
> I am looking for some help regarding an python OperationalError that I
> recently received while executing a python script using sqlalchemy and
> psycopg2. The python script parses an xml file stored on a networked
> drive and enters the information into a pgsql database. Sometimes
> these xml files reference a binary file which is also located on the
> networked drive. These files are subsequently read in and stored in a
> table along with the file's md5. The binary data is stored within a
> bytea column. This script worked pretty well until recently when it
> came across a binary file about 258MB in size. While reading the file
> off the networked drive I received an OperationalError indicating that
> it was unable to allocate memory for the output buffer. My initial
> guess was that it ran out of memory, but according to the task manager
> the machine had close to 2GB free when the error occurred.

Out of memory is probably exactly right. The textual representation of
arbitrary bytea data is normally several times the size of the raw bits
(worst case is 5x bigger, typical case perhaps half that). In addition
to that you have to consider that there are likely to be several copies
of the string floating around in your process' memory space. If you're
doing this in a 32bit environment it doesn't surprise me at all that
258MB of raw data would exhaust available memory.

Going to a 64bit implementation would help some, but I'm not sure that
that's an available option for you on Windows, and anyway it doesn't
eliminate the problem completely. If you want to process really large
binary files you're going to need to divide them into segments.

regards, tom lane


From: Peter Erickson <redlamb(at)redlamb(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Cannot allocate memory for output buffer
Date: 2009-11-27 22:55:30
Message-ID: 4B1058E2.4000002@redlamb.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Thanks. Out of curiosity, if memory exhaustion was the problem, any idea
why the task manager would show that I'm only using 1.2GB of the 3GB of
memory?

On 11/27/2009 5:15 PM, Tom Lane wrote:
> Pete Erickson <redlamb(at)redlamb(dot)net> writes:
>> I am looking for some help regarding an python OperationalError that I
>> recently received while executing a python script using sqlalchemy and
>> psycopg2. The python script parses an xml file stored on a networked
>> drive and enters the information into a pgsql database. Sometimes
>> these xml files reference a binary file which is also located on the
>> networked drive. These files are subsequently read in and stored in a
>> table along with the file's md5. The binary data is stored within a
>> bytea column. This script worked pretty well until recently when it
>> came across a binary file about 258MB in size. While reading the file
>> off the networked drive I received an OperationalError indicating that
>> it was unable to allocate memory for the output buffer. My initial
>> guess was that it ran out of memory, but according to the task manager
>> the machine had close to 2GB free when the error occurred.
>
> Out of memory is probably exactly right. The textual representation of
> arbitrary bytea data is normally several times the size of the raw bits
> (worst case is 5x bigger, typical case perhaps half that). In addition
> to that you have to consider that there are likely to be several copies
> of the string floating around in your process' memory space. If you're
> doing this in a 32bit environment it doesn't surprise me at all that
> 258MB of raw data would exhaust available memory.
>
> Going to a 64bit implementation would help some, but I'm not sure that
> that's an available option for you on Windows, and anyway it doesn't
> eliminate the problem completely. If you want to process really large
> binary files you're going to need to divide them into segments.
>
> regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Erickson <redlamb(at)redlamb(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Cannot allocate memory for output buffer
Date: 2009-11-27 23:12:29
Message-ID: 21941.1259363549@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Peter Erickson <redlamb(at)redlamb(dot)net> writes:
> Thanks. Out of curiosity, if memory exhaustion was the problem, any idea
> why the task manager would show that I'm only using 1.2GB of the 3GB of
> memory?

Well, it would've failed to allocate the next copy of the string that it
needed ... and I think also we try to keep the output buffer size at a
power of 2, so it might have been asking for 2GB for the output buffer.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Peter Erickson <redlamb(at)redlamb(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Cannot allocate memory for output buffer
Date: 2009-11-28 16:42:03
Message-ID: 20091128164203.GA10880@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Fri, Nov 27, 2009 at 05:55:30PM -0500, Peter Erickson wrote:
> Thanks. Out of curiosity, if memory exhaustion was the problem, any idea
> why the task manager would show that I'm only using 1.2GB of the 3GB of
> memory?

Note that what usually kills you first on 32-bit systems is that you
want 250MB of *contiguous* memory. Depending on how the program is
started and if you have memory map randomization, it becomes entirely
possible to have more than 1GB free, yet no blocks big enough to hold
250MB together.

64-bit systems don't suffer from this problem, the address space is so
large it's just not an issue.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.