Re: getBinaryStream and OutOfMemoryException

Lists: pgsql-jdbc
From: Jesper Thorhauge <jth(at)conzentrate(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: getBinaryStream and OutOfMemoryException
Date: 2005-06-28 08:15:31
Message-ID: 42C10723.5010702@conzentrate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Hi

I have a problem when retreiving relative large files from a bytea field
in the
database using getBinaryStream.

ResultSet res....
InputStream is = res.getBinaryStream("largefile_bytea");

causes OutOfMemoryException...

setBinaryStream also used to cause this problem but seems to be fixed
with the
8.0-release of the jdbc driver ( thanks!!! :-) ).

Whenever i use getBinaryStream on a file of size 9mb or larger it runs
out of
memory. I know i can just raise jvm heap using -Xmx, but it seems to me like
the streaming feature doesn't work properly. Looking into the cvs
checkout from
22-05-2005, it really seems to me that getBinaryStream is nothing more
than a
getBytes ...?? Searching the mailing list archive didn't give me the
answer, so
i hope someone out there is able to provide it.

My setup is;

- postgresql-8.0-311.jdbc3 driver
- postgresql 8.0
- jboss/tomcat using hibernate
- running on windows xp with jdk 1.5.0_02-b09

Regards
Jesper Thorhauge


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Jesper Thorhauge <jth(at)conzentrate(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: getBinaryStream and OutOfMemoryException
Date: 2005-06-28 13:06:13
Message-ID: 42C14B45.8000500@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Jesper Thorhauge wrote:
> Looking into the cvs
> checkout from
> 22-05-2005, it really seems to me that getBinaryStream is nothing more
> than a
> getBytes ...??

Correct. The current frontend/backend protocol doesn't really let us do
this differently, at least for bytea columns. It could be made more
memory-efficient (e.g. by using the binary result format rather than the
current text format) but the driver still has to pull the entire column
value across the wire at the time the row is returned.

Another approach would be to write large column values to disk, but
noone's written the code to do that yet; it also has some other problems
such as: what if you are running in an environment where you don't have
access to the disk?

-O


From: Jesper Thorhauge <jth(at)conzentrate(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: getBinaryStream and OutOfMemoryException
Date: 2005-06-28 13:13:14
Message-ID: 42C14CEA.4020704@conzentrate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Okay, thanks alot for the quick answer! I was wondering, do you know how
much more efficient it could be done using the binary format you are
mentioning? I'm especially interested in decreasing the amount of memory
used to get large bytea values...

/Jesper

Oliver Jowett wrote:

>Jesper Thorhauge wrote:
>
>
>>Looking into the cvs
>>checkout from
>>22-05-2005, it really seems to me that getBinaryStream is nothing more
>>than a
>>getBytes ...??
>>
>>
>
>Correct. The current frontend/backend protocol doesn't really let us do
>this differently, at least for bytea columns. It could be made more
>memory-efficient (e.g. by using the binary result format rather than the
>current text format) but the driver still has to pull the entire column
>value across the wire at the time the row is returned.
>
>Another approach would be to write large column values to disk, but
>noone's written the code to do that yet; it also has some other problems
>such as: what if you are running in an environment where you don't have
>access to the disk?
>
>-O
>
>


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Jesper Thorhauge <jth(at)conzentrate(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: getBinaryStream and OutOfMemoryException
Date: 2005-06-28 13:29:45
Message-ID: 42C150C9.70101@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Jesper Thorhauge wrote:
> Okay, thanks alot for the quick answer! I was wondering, do you know how
> much more efficient it could be done using the binary format you are
> mentioning? I'm especially interested in decreasing the amount of memory
> used to get large bytea values...

Um, rough estimate would be about a factor of 5 (assume text
representation of bytea averages about ~2.5 characters per raw byte, and
characters are 16 bits on the Java side)

-O


From: Jesper Thorhauge <jth(at)conzentrate(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: getBinaryStream and OutOfMemoryException
Date: 2005-06-28 13:38:55
Message-ID: 42C152EF.305@conzentrate.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Okay, thats seems like a lot!! What are the priorities about getting
this modification into the driver? I have seen some other posts
regarding this binaryStream / memory problem, so it seems that i'm not
the only one :-)

Oliver Jowett wrote:

>Jesper Thorhauge wrote:
>
>
>>Okay, thanks alot for the quick answer! I was wondering, do you know how
>>much more efficient it could be done using the binary format you are
>>mentioning? I'm especially interested in decreasing the amount of memory
>>used to get large bytea values...
>>
>>
>
>Um, rough estimate would be about a factor of 5 (assume text
>representation of bytea averages about ~2.5 characters per raw byte, and
>characters are 16 bits on the Java side)
>
>-O
>
>---------------------------(end of broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>
>


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Jesper Thorhauge <jth(at)conzentrate(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: getBinaryStream and OutOfMemoryException
Date: 2005-06-28 22:19:20
Message-ID: 42C1CCE8.6060809@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Jesper Thorhauge wrote:
> Okay, thats seems like a lot!! What are the priorities about getting
> this modification into the driver? I have seen some other posts
> regarding this binaryStream / memory problem, so it seems that i'm not
> the only one :-)

There's no real priority; it'll happen when someone with time and
motivation implements it..

It's not trivial to do as the driver doesn't know ahead of time which
columns are bytea -- so either it must do an extra network roundtrip to
enquire about column types, or be ready to handle binary-format results
for *all* types.

-O