Does "preparing" a PreparedStatement really help?

From: Paul Tomblin <ptomblin(at)gmail(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Does "preparing" a PreparedStatement really help?
Date: 2008-02-23 16:25:58
Message-ID: 47C04916.2070500@gmail.com
Lists: pgsql-jdbc

Is there a performance advantage for preparing a PreparedStatement and keeping
it around and using it thousands of times rather than making a new Statement
every time? How big?

Back when I was doing Oracle Call Interface programming in C in the mid to
late 1980s, we were always told that pre-parsing a query was very expensive,
so you tried not to do it very often, and once you'd done it, you stored
them for reuse. As I try to switch this system over to using a connection
pool, trying to store PreparedStatements for each connection is fairly
complicated and I'm wondering if it's worth it.
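
To make the question concrete, here is roughly the pattern I mean, with
placeholder table and column names ("member", "id", "name"):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LookupPatterns {

    // Current approach: a fresh Statement (and a fresh parse) on every call.
    static String lookupOnce(Connection conn, int id) throws SQLException {
        Statement st = conn.createStatement();
        try {
            ResultSet rs = st.executeQuery(
                    "SELECT name FROM member WHERE id = " + id);
            return rs.next() ? rs.getString(1) : null;
        } finally {
            st.close();
        }
    }

    // Alternative: prepare once, then bind and execute thousands of times.
    static void lookupMany(Connection conn, int[] ids) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                "SELECT name FROM member WHERE id = ?");
        try {
            for (int id : ids) {
                ps.setInt(1, id);
                ResultSet rs = ps.executeQuery();
                if (rs.next()) {
                    System.out.println(rs.getString(1));
                }
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}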

--
Paul Tomblin <ptomblin(at)xcski(dot)com> http://blog.xcski.com/
I wouldn't be surprised if I'd have to put garlic in the CD drawer
to really get rid of it.
-- Arthur van der Harg on 'Gator'


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Paul Tomblin <ptomblin(at)gmail(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Does "preparing" a PreparedStatement really help?
Date: 2008-02-23 18:14:58
Message-ID: A2C50787-E14C-4679-BCBB-96DA09D0B697@fastcrypt.com
Lists: pgsql-jdbc

Depending on the complexity of the statement, yes it does make a
difference.

You would have to measure it yourself. Your mileage would vary.
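
A quick-and-dirty way to measure it could look like this (the connection
URL, credentials and "member" table are placeholders; the absolute numbers
will depend entirely on your schema, driver settings and hardware):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class PrepareBenchmark {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "myuser", "secret");
        int iterations = 10000;

        // Variant 1: a brand-new Statement (and a new parse) per execution.
        long t0 = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            Statement st = conn.createStatement();
            st.executeQuery("SELECT name FROM member WHERE id = " + (i % 100))
              .close();
            st.close();
        }
        long plainMillis = System.currentTimeMillis() - t0;

        // Variant 2: one PreparedStatement, bound and executed repeatedly.
        PreparedStatement ps = conn.prepareStatement(
                "SELECT name FROM member WHERE id = ?");
        long t1 = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            ps.setInt(1, i % 100);
            ps.executeQuery().close();
        }
        long preparedMillis = System.currentTimeMillis() - t1;
        ps.close();
        conn.close();

        System.out.println("Statement: " + plainMillis
                + " ms, PreparedStatement: " + preparedMillis + " ms");
    }
}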

Dave
On 23-Feb-08, at 11:25 AM, Paul Tomblin wrote:

> Is there a performance advantage for preparing a PreparedStatement
> and keeping it around and using it thousands of times rather than
> making a new Statement every time? How big?
>
> Back when I was doing Oracle Call Interface programming in C in the
> mid to late 1980s, we were always told that pre-parsing a query was
> very expensive, so you tried not to do it very often, and once you'd
> done it, you stored them for reuse. As I try to switch
> this system over to using a connection pool, trying to store
> PreparedStatements for each connection is fairly complicated and I'm
> wondering if it's worth it.
>
>
> --
> Paul Tomblin <ptomblin(at)xcski(dot)com> http://blog.xcski.com/
> I wouldn't be surprised if I'd have to put garlic in the CD drawer
> to really get rid of it.
> -- Arthur van der Harg on 'Gator'


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Paul Tomblin" <ptomblin(at)gmail(dot)com>
Cc: <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Does "preparing" a PreparedStatement really help?
Date: 2008-02-23 19:17:19
Message-ID: 47C0713F.7030804@enterprisedb.com
Lists: pgsql-jdbc

Paul Tomblin wrote:
> Is there a performance advantage for preparing a PreparedStatement and
> keeping it around and using it thousands of times rather than making a
> new Statement every time?

Yes.

> How big?

It depends. If your queries are not very complex, or the little bit of
extra CPU usage isn't a problem because the bottlenecks are elsewhere, it
might be completely insignificant for you. You would have to test it with
your application to know for sure.

> Back when I was doing Oracle Call Interface programming in C back in the
> mid to late 1980s, we were always told that pre-parsing a query was very
> expensive and so you tried not to do it very often, and once you'd done
> it, you stored them to reuse. As I try to switch this system over to
> using a connection pool, trying to store PreparedStatements for each
> connection is fairly complicated and I'm wondering if it's worth it.

Yeah, with a connection pool you can't really do it like that. You want
to use a technique called "statement caching", where the connection pool
/ driver keeps a cache of prepared statements, so that when you create a
new PreparedStatement and prepare it, it actually reuses an already
prepared one from the cache. Many if not most connection pool
implementations have a statement cache, but you might need to do
something to enable it; check the docs. There's also a stand-alone
statement cache implementation at
http://jdbccache.projects.postgresql.org/ which you can use.
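
For example, with c3p0 (just as an illustration; the URL, credentials and
table name are placeholders, and other pools have similar settings) it could
look something like this:

import java.sql.Connection;
import java.sql.PreparedStatement;

import com.mchange.v2.c3p0.ComboPooledDataSource;

public class PoolWithStatementCache {
    public static void main(String[] args) throws Exception {
        ComboPooledDataSource ds = new ComboPooledDataSource();
        ds.setDriverClass("org.postgresql.Driver");
        // prepareThreshold=1 asks the PostgreSQL JDBC driver to switch to a
        // server-side prepared statement on the first execution (default 5).
        ds.setJdbcUrl("jdbc:postgresql://localhost/mydb?prepareThreshold=1");
        ds.setUser("myuser");
        ds.setPassword("secret");
        // Enable c3p0's statement cache; it is disabled when maxStatements
        // is 0.
        ds.setMaxStatements(200);

        // Application code just prepares statements as usual; when the same
        // SQL is prepared again on the same pooled connection, the cached
        // PreparedStatement is handed back instead of a freshly parsed one.
        Connection conn = ds.getConnection();
        PreparedStatement ps = conn.prepareStatement(
                "SELECT name FROM member WHERE id = ?");
        ps.setInt(1, 42);
        ps.executeQuery().close();
        ps.close();
        conn.close(); // returns the connection to the pool
    }
}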

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: mljv(at)planwerk6(dot)de
To: pgsql-jdbc(at)postgresql(dot)org
Cc: Paul Tomblin <ptomblin(at)gmail(dot)com>
Subject: Re: Does "preparing" a PreparedStatement really help?
Date: 2008-02-24 12:50:15
Message-ID: 200802241350.15570.mljv@planwerk6.de
Lists: pgsql-jdbc

On Saturday, 23 February 2008 17:25:58, Paul Tomblin wrote:
> Is there a performance advantage for preparing a PreparedStatement and
> keeping it around and using it thousands of times rather than making a new
> Statement every time? How big?

I just dealt with this topic over the last two weeks, so I can share my experience.

Summary: it makes a huge difference!

We have a Java web application with Hibernate as our ORM tool and c3p0 as
the connection pool, and a dedicated PostgreSQL server with a dual-core AMD
CPU and 8 GB of RAM. Most of our queries are NOT very complicated to parse.
We mostly use Hibernate-generated queries, which are usually very simple,
like "select * from table where id = ?". But Hibernate expands the query
string itself, so it becomes rather long, like "select table0.attribute0,
table0.attribute1 from table as table0 where table0.attribute0 = ?" and so
on. So I don't know whether parsing an explicit attribute list instead of *
takes more parsing time or not. We have some more complicated queries, but
with no more than 3-4 joins and usually no difficult subqueries or other
complications. Our application serves about 30 million page views per month,
and we have peak times where we need maximum performance.

I'm telling you all this so you can compare it with your own environment.

By mistake, we stopped using prepared statements last week by setting
c3p0.maxStatements = 0. It was the only change in the application that could
have affected performance. What happened?

- CPU load doubled from 25% to over 50%
- Incoming network traffic doubled (because full statements were sent to the
DB backend instead of statement names)
- Load average increased dramatically

We switched back to prepared statements and the monitoring graphs went back
to normal. Maybe it is all just about saving network traffic that the CPU
has to handle; I don't know.

I have really never seen anything like this before. If you read about
performance, books and documentation always say "it depends on your
environment", and of course they are right. But I don't like it when
something just says "it's faster" or "it's faster depending on your
environment".

I often have no clue what that means. Of course you always have to check
your own bottlenecks, but it would be nice if words like "faster" always
came with an example benchmark, so you could compare with your environment.

When I first saw "preparing statements" I thought it might have no impact
for me, as the parsing stage takes only a few CPU cycles. But I was wrong,
so I am glad to share my experience with you.

But I guess it does not make a huge difference if you are not struggling
with lots of concurrent users. Often the parsing time is only a few
milliseconds. One example:

select * from member where member_name = 'Jim';
Time: 65.386 ms (average of 50 executions)

prepare s1 (text) AS select * from member where member_name = $1;
execute s1 ('Jim');
Time: 61.576 ms (average of 50 executions)

(I executed these a few times and the values are averages.)

So you save 4 ms on this query. If you don't have concurrent users and the
whole application response takes 400 ms, you won't see any difference, as a
1% performance gain is not "visible". But if you have 50 requests per second
at peak time, you save 200 ms of work per second on your database server, so
your CPU load might drop by 20%.

But prepared statements have one disadvantage. If you use PostgreSQL before
8.3 and don't drop your pooled connections from time to time, the prepared
plan will be used forever, regardless of new statistics gathered by ANALYZE.
This is no longer true with 8.3. As I use 8.1 at the moment, I set a
maximumLiveTime for each pooled connection to be sure queries are replanned
from time to time.
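
In c3p0 terms, the idea is roughly this (I am quoting the property names
from memory, so please check the c3p0 documentation; maxConnectionAge is in
seconds):

# keep the statement cache enabled
c3p0.maxStatements = 200
# but retire every connection after an hour, so that its cached plans are
# thrown away and the queries get replanned against fresh statistics
c3p0.maxConnectionAge = 3600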

If you have any further questions about my scenario, don't hesitate to ask
me.

kind regards
Janning