Re: Cursors performance (was: Re: [PERFORM] Terrible performance after deleting/recreating indexes)

Lists: pgsql-jdbcpgsql-performance
From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Terrible performance after deleting/recreating indexes
Date: 2004-07-07 16:16:40
Message-ID: 20040707161640.48270.qmail@web51408.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Hi,

Using PostgreSQL 7.4.2 on Solaris. I'm trying to
improve performance on some queries to my databases so
I wanted to try out various index structures.

Since I'm going to be running my performance tests
repeatedly, I created some SQL scripts to delete and
recreate various index configurations. One of the
scripts contains the commands for recreating the
'original' index configuration (i.e. the one I've
already got some numbers for). Only thing is now
when I delete and recreate the original indexes then
run the query, I'm finding the performance has gone
completely down the tubes compared to what I
originally had. A query that used to take 5 minutes
to complete now takes hours to complete.

For what it's worth my query looks something like:

select * from tbl_1, tbl_2 where tbl_1.id = tbl_2.id
and tbl_2.name like 'x%y%' and tbl_1.x > 1234567890123
order by tbl_1.x;

tbl_1 is very big (> 2 million rows)
tbl_2 is relatively small (7000 or so rows)
tbl_1.x is a numeric(13)
tbl_1.id & tbl_2.id are integers
tbl_2.name is a varchar(64)

I've run 'VACUUM ANALYZE' on both tables involved in
the query. I also used 'EXPLAIN' and observed that
the query plan is completely changed from what it
was originally.

Any idea why this would be? I would have thougth
that a freshly created index would have better
performance not worse. I have not done any inserts
or updates since recreating the indexes.

thanks in advance,

Bill C

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com


From: Shridhar Daithankar <shridhar(at)frodo(dot)hserus(dot)net>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Terrible performance after deleting/recreating indexes
Date: 2004-07-08 09:07:37
Message-ID: 40ED0ED9.9000106@frodo.hserus.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Bill Chandler wrote:

> Hi,
>
> Using PostgreSQL 7.4.2 on Solaris. I'm trying to
> improve performance on some queries to my databases so
> I wanted to try out various index structures.
>
> Since I'm going to be running my performance tests
> repeatedly, I created some SQL scripts to delete and
> recreate various index configurations. One of the
> scripts contains the commands for recreating the
> 'original' index configuration (i.e. the one I've
> already got some numbers for). Only thing is now
> when I delete and recreate the original indexes then
> run the query, I'm finding the performance has gone
> completely down the tubes compared to what I
> originally had. A query that used to take 5 minutes
> to complete now takes hours to complete.
>
> For what it's worth my query looks something like:
>
> select * from tbl_1, tbl_2 where tbl_1.id = tbl_2.id
> and tbl_2.name like 'x%y%' and tbl_1.x > 1234567890123
> order by tbl_1.x;
>
> tbl_1 is very big (> 2 million rows)
> tbl_2 is relatively small (7000 or so rows)
> tbl_1.x is a numeric(13)
> tbl_1.id & tbl_2.id are integers
> tbl_2.name is a varchar(64)
>
> I've run 'VACUUM ANALYZE' on both tables involved in
> the query. I also used 'EXPLAIN' and observed that
> the query plan is completely changed from what it
> was originally.

Get an explain analyze. That gives actual v/s planned time spent. See what is
causing the difference. A discrepency between planned and actual row is usually
a indication of out-of-date stats.

Which are the indexes on these tables? You should list fields with indexes first
in where clause. Also list most selective field first so that it eliminates as
many rows as possible in first scan.

I hope you have read the tuning articles on varlena.com and applied some basic
tuning.

And post the table schema, hardware config, postgresql config(important ones of
course) and explain analyze for queries. That would be something to start with.

Shridhar


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Terrible performance after deleting/recreating indexes
Date: 2004-07-08 20:49:21
Message-ID: 20040708204921.4921.qmail@web51407.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Thanks for the advice.

On further review it appears I am only getting this
performance degradation when I run the command via
a JDBC app. If I do the exact same query from
psql, the performance is fine. I've tried both the
JDBC2 and JDBC3 jars. Same results.

It definitely seems to correspond to deleting and
recreating the indexes, though. The same query thru
JDBC worked fine before recreating the indexes.

Does that make any sense at all?

thanks

Bill

--- Shridhar Daithankar <shridhar(at)frodo(dot)hserus(dot)net>
wrote:
> Bill Chandler wrote:
>
> > Hi,
> >
> > Using PostgreSQL 7.4.2 on Solaris. I'm trying to
> > improve performance on some queries to my
> databases so
> > I wanted to try out various index structures.
> >
> > Since I'm going to be running my performance tests
> > repeatedly, I created some SQL scripts to delete
> and
> > recreate various index configurations. One of the
> > scripts contains the commands for recreating the
> > 'original' index configuration (i.e. the one I've
> > already got some numbers for). Only thing is now
> > when I delete and recreate the original indexes
> then
> > run the query, I'm finding the performance has
> gone
> > completely down the tubes compared to what I
> > originally had. A query that used to take 5
> minutes
> > to complete now takes hours to complete.
> >
> > For what it's worth my query looks something like:
> >
> > select * from tbl_1, tbl_2 where tbl_1.id =
> tbl_2.id
> > and tbl_2.name like 'x%y%' and tbl_1.x >
> 1234567890123
> > order by tbl_1.x;
> >
> > tbl_1 is very big (> 2 million rows)
> > tbl_2 is relatively small (7000 or so rows)
> > tbl_1.x is a numeric(13)
> > tbl_1.id & tbl_2.id are integers
> > tbl_2.name is a varchar(64)
> >
> > I've run 'VACUUM ANALYZE' on both tables involved
> in
> > the query. I also used 'EXPLAIN' and observed
> that
> > the query plan is completely changed from what it
> > was originally.
>
> Get an explain analyze. That gives actual v/s
> planned time spent. See what is
> causing the difference. A discrepency between
> planned and actual row is usually
> a indication of out-of-date stats.
>
>
> Which are the indexes on these tables? You should
> list fields with indexes first
> in where clause. Also list most selective field
> first so that it eliminates as
> many rows as possible in first scan.
>
>
> I hope you have read the tuning articles on
> varlena.com and applied some basic
> tuning.
>
> And post the table schema, hardware config,
> postgresql config(important ones of
> course) and explain analyze for queries. That would
> be something to start with.
>
> Shridhar
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: you can get off all lists at once with the
> unregister command
> (send "unregister YourEmailAddressHere" to
> majordomo(at)postgresql(dot)org)
>


__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail


From: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Terrible performance after deleting/recreating indexes
Date: 2004-07-09 00:14:32
Message-ID: 40EDE368.90603@coretech.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

That is interesting - both psql and JDBC merely submit statements for
the backend to process, so generally you would expect no difference in
execution plan or performance.

It might be worth setting "log_statement=true" in postgresql.conf and
checking that you are executing *exactly* the same statement in both
JDBC and psql.

regards

Mark

P.s : lets see the output from EXPLAIN ANALYZE :-)

Bill Chandler wrote:

>Thanks for the advice.
>
>On further review it appears I am only getting this
>performance degradation when I run the command via
>a JDBC app. If I do the exact same query from
>psql, the performance is fine. I've tried both the
>JDBC2 and JDBC3 jars. Same results.
>
>
>
>
>
>
>


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Terrible performance after deleting/recreating indexes
Date: 2004-07-09 15:18:48
Message-ID: 20040709151848.89738.qmail@web51410.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Thanks for this tip. Turns out there is a difference.
I am using cursors (i.e. calling setFetchSize(5000) on
my Statement) in JDBC. So the SQL statement is
preceded by:

DECLARE JDBC_CURS_1 CURSOR FOR ...

which is then followed by the SQL statemnt.

This is followed by the separate statement:

FETCH FORWARD 5000 FROM JDBC_CURS_1;

Also, don't know if this is significant but there
are a few lines before both of these:

set datestyle to 'ISO'; select version(), case when
pg_encoding_to_char(1) = 'SQL_ASCII' then 'UNKNOWN'
else getdatabaseencoding() end;
set client_encoding = 'UNICODE
begin;

Only thing is, though, none of this is new. I was
using cursors before as well.

Here is the output from "EXPLAIN ANALYZE". Hope it
comes out readable:

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=50466.04..50470.45 rows=1765 width=114)
(actual time=87237.003..88235.011 rows=108311 loops=1)
Sort Key: iso_nep_data_update_events.lds
-> Merge Join (cost=49240.03..50370.85 rows=1765
width=114) (actual time=56658.356..65221.995
rows=108311 loops=1)
Merge Cond: ("outer".obj_id = "inner".obj_id)
-> Sort (cost=198.01..198.16 rows=61
width=65) (actual time=175.947..181.172 rows=3768
loops=1)
Sort Key: iso_nep_control.obj_id
-> Seq Scan on iso_nep_control
(cost=0.00..196.20 rows=61 width=65) (actual
time=0.056..108.151 rows=3768 loops=1)
Filter: ((real_name)::text ~~
'NEPOOL%REAL%'::text)
-> Sort (cost=49042.02..49598.46
rows=222573 width=69) (actual
time=56482.073..58642.901 rows=216528 loops=1)
Sort Key:
iso_nep_data_update_events.obj_id
-> Index Scan using iso_nep_due_idx1
on iso_nep_data_update_events (cost=0.00..7183.18
rows=222573 width=69) (actual time=0.179..11739.104
rows=216671 loops=1)
Index Cond: (lds >
1088554754000::numeric)
Total runtime: 88643.330 ms
(13 rows)

Here is the actual query:

select iso_nep_DATA_UPDATE_EVENTS.lds,
iso_nep_DATA_UPDATE_EVENTS.tsds,
iso_nep_DATA_UPDATE_EVENTS.value,
iso_nep_DATA_UPDATE_EVENTS.correction,
iso_nep_DATA_UPDATE_EVENTS.delta_lds_tsds,
iso_nep_CONTROL.real_name,
iso_nep_CONTROL.freq,
iso_nep_CONTROL.type from
iso_nep_DATA_UPDATE_EVENTS, iso_nep_CONTROL
where iso_nep_CONTROL.real_name like
'NEPOOL%REAL%' escape '/' and
iso_nep_DATA_UPDATE_EVENTS.obj_id =
iso_nep_CONTROL.obj_id and
iso_nep_DATA_UPDATE_EVENTS.lds > 1088554754000
order by lds;

Two tables: iso_nep_data_update_events and
iso_nep_control. Basically getting all columns from
both tables. Joining the tables on obj_id = obj_id.
Have unique indexes on iso_nep_control.obj_id
(clustered) and iso_nep_control.real_name. Have
non-unique indexes on iso_nep_data_update_events.lds
and iso_nep_data_update_events.obj_id.

thanks,

Bill

--- Mark Kirkwood <markir(at)coretech(dot)co(dot)nz> wrote:
> That is interesting - both psql and JDBC merely
> submit statements for
> the backend to process, so generally you would
> expect no difference in
> execution plan or performance.
>
> It might be worth setting "log_statement=true" in
> postgresql.conf and
> checking that you are executing *exactly* the same
> statement in both
> JDBC and psql.
>
> regards
>
> Mark
>
> P.s : lets see the output from EXPLAIN ANALYZE :-)
>
> Bill Chandler wrote:
>
> >Thanks for the advice.
> >
> >On further review it appears I am only getting this
>
> >performance degradation when I run the command via
> >a JDBC app. If I do the exact same query from
> >psql, the performance is fine. I've tried both the
> >JDBC2 and JDBC3 jars. Same results.
> >
> >
> >
> >
> >
> >
> >
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>



__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - 100MB free storage!
http://promotions.yahoo.com/new_mail


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Cursors performance (was: Re: [PERFORM] Terrible performance after deleting/recreating indexes)
Date: 2004-07-09 20:24:16
Message-ID: 20040709202416.54013.qmail@web51401.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Thanks to all who have responded. I now think my
problem is not related to deleting/recreating indexes.
Somehow it is related to JDBC cursors. It appears
that what is happening is that since I'm using
a fetch size of 5000, the command:

FETCH FORWARD 5000 FROM JDBC_CURS_1

is being repeatedly sent to the server as I process
the result set from my query. Each time this command
is sent it it takes about 5 minutes to return which is
about the amount of time the whole query took to
complete before the performance degredation. So in
other words it looks as if the full select is being
rerun on each fetch.

Now the mystery is why is this happening all of the
sudden? I have been running w/ fetch size set to 5000
for the last couple of weeks and it did not appear to
be doing this (i.e. re-running the entire select
statement again). Is this what I should expect when
using cursors? I would have thought that the server
should "remember" where it left off in the query since
the last fetch and continue from there.

Could I have inadvertently changed a parameter
somewhere that would cause this behavior?

thanks,

Bill

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Cursors performance (was: Re: [PERFORM] Terrible
Date: 2004-07-09 20:39:01
Message-ID: 1089405541.3645.289.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Bill,

What happens if you do this in psql, also you can turn on duration
logging in the backend and log the queries.

dave
On Fri, 2004-07-09 at 16:24, Bill Chandler wrote:
> Thanks to all who have responded. I now think my
> problem is not related to deleting/recreating indexes.
> Somehow it is related to JDBC cursors. It appears
> that what is happening is that since I'm using
> a fetch size of 5000, the command:
>
> FETCH FORWARD 5000 FROM JDBC_CURS_1
>
> is being repeatedly sent to the server as I process
> the result set from my query. Each time this command
> is sent it it takes about 5 minutes to return which is
> about the amount of time the whole query took to
> complete before the performance degredation. So in
> other words it looks as if the full select is being
> rerun on each fetch.
>
> Now the mystery is why is this happening all of the
> sudden? I have been running w/ fetch size set to 5000
> for the last couple of weeks and it did not appear to
> be doing this (i.e. re-running the entire select
> statement again). Is this what I should expect when
> using cursors? I would have thought that the server
> should "remember" where it left off in the query since
> the last fetch and continue from there.
>
> Could I have inadvertently changed a parameter
> somewhere that would cause this behavior?
>
> thanks,
>
> Bill
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam? Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>
>
>
> !DSPAM:40eefff6170301475214189!
>
>
--
Dave Cramer
519 939 0336
ICQ # 14675561


From: Kris Jurka <books(at)ejurka(dot)com>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Cursors performance (was: Re: [PERFORM] Terrible performance
Date: 2004-07-09 20:44:32
Message-ID: Pine.BSO.4.56.0407091542140.17911@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

On Fri, 9 Jul 2004, Bill Chandler wrote:

> Thanks to all who have responded. I now think my
> problem is not related to deleting/recreating indexes.
> Somehow it is related to JDBC cursors. It appears
> that what is happening is that since I'm using
> a fetch size of 5000, the command:
>
> FETCH FORWARD 5000 FROM JDBC_CURS_1
>

If the top level node of your execution plan is a sort step, it should
take essentially no time to retrieve additional rows after the first
fetch. The sort step is materializes the results so that future fetches
simply need to spit this data back to the client.

I would agree with Dave's suggestion to use log_duration and compare the
values for the first and subsequent fetches.

Kris Jurka


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: pg(at)fastcrypt(dot)com
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Cursors performance (was: Re: [PERFORM] Terrible performance after deleting/recreating indexes)
Date: 2004-07-09 21:03:48
Message-ID: 20040709210348.1813.qmail@web51406.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Using psql it peforms exactly as I'd expect. The
rows get printed out to stdout, I hold down the space
bar to keep everything scrolling and as every 5000
rows go by I see a new fetch statement logged in the
server log. The results from this statement seem to
come back instaneously and the output starts scrolling
again immediately. Whole query takes a few minutes
to complete.

I seems like it has something to do w/ my JDBC app
but I can't think for the life of me what I might have
changed. Anyway, there's only the setFetchSize(5000)
and the setAutoCommit(false) that are relevant to
cursors, right? And those have been in there for
weeks.

Bill

--- Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
> Bill,
>
> What happens if you do this in psql, also you can
> turn on duration
> logging in the backend and log the queries.
>
> dave
> On Fri, 2004-07-09 at 16:24, Bill Chandler wrote:
> > Thanks to all who have responded. I now think my
> > problem is not related to deleting/recreating
> indexes.
> > Somehow it is related to JDBC cursors. It appears
> > that what is happening is that since I'm using
> > a fetch size of 5000, the command:
> >
> > FETCH FORWARD 5000 FROM JDBC_CURS_1
> >
> > is being repeatedly sent to the server as I
> process
> > the result set from my query. Each time this
> command
> > is sent it it takes about 5 minutes to return
> which is
> > about the amount of time the whole query took to
> > complete before the performance degredation. So in
> > other words it looks as if the full select is
> being
> > rerun on each fetch.
> >
> > Now the mystery is why is this happening all of
> the
> > sudden? I have been running w/ fetch size set to
> 5000
> > for the last couple of weeks and it did not appear
> to
> > be doing this (i.e. re-running the entire select
> > statement again). Is this what I should expect
> when
> > using cursors? I would have thought that the
> server
> > should "remember" where it left off in the query
> since
> > the last fetch and continue from there.
> >
> > Could I have inadvertently changed a parameter
> > somewhere that would cause this behavior?
> >
> > thanks,
> >
> > Bill
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam? Yahoo! Mail has the best spam
> protection around
> > http://mail.yahoo.com
> >
> > ---------------------------(end of
> broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please
> send an appropriate
> > subscribe-nomail command to
> majordomo(at)postgresql(dot)org so that your
> > message can get through to the mailing list
> cleanly
> >
> >
> >
> > !DSPAM:40eefff6170301475214189!
> >
> >
> --
> Dave Cramer
> 519 939 0336
> ICQ # 14675561
>
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com


From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Cursors performance (was: Re: [PERFORM] Terrible
Date: 2004-07-09 21:10:57
Message-ID: 1089407457.1518.295.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Ok, given that there are 5000 rows, the jdbc driver will actually fetch
all 5000 when you do the fetch, so is it the speed of the connection, or
the actual fetch that is taking the time, again, check the server logs
for duration.

Dave
On Fri, 2004-07-09 at 17:03, Bill Chandler wrote:
> Using psql it peforms exactly as I'd expect. The
> rows get printed out to stdout, I hold down the space
> bar to keep everything scrolling and as every 5000
> rows go by I see a new fetch statement logged in the
> server log. The results from this statement seem to
> come back instaneously and the output starts scrolling
> again immediately. Whole query takes a few minutes
> to complete.
>
> I seems like it has something to do w/ my JDBC app
> but I can't think for the life of me what I might have
> changed. Anyway, there's only the setFetchSize(5000)
> and the setAutoCommit(false) that are relevant to
> cursors, right? And those have been in there for
> weeks.
>
> Bill
>
> --- Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
> > Bill,
> >
> > What happens if you do this in psql, also you can
> > turn on duration
> > logging in the backend and log the queries.
> >
> > dave
> > On Fri, 2004-07-09 at 16:24, Bill Chandler wrote:
> > > Thanks to all who have responded. I now think my
> > > problem is not related to deleting/recreating
> > indexes.
> > > Somehow it is related to JDBC cursors. It appears
> > > that what is happening is that since I'm using
> > > a fetch size of 5000, the command:
> > >
> > > FETCH FORWARD 5000 FROM JDBC_CURS_1
> > >
> > > is being repeatedly sent to the server as I
> > process
> > > the result set from my query. Each time this
> > command
> > > is sent it it takes about 5 minutes to return
> > which is
> > > about the amount of time the whole query took to
> > > complete before the performance degredation. So in
> > > other words it looks as if the full select is
> > being
> > > rerun on each fetch.
> > >
> > > Now the mystery is why is this happening all of
> > the
> > > sudden? I have been running w/ fetch size set to
> > 5000
> > > for the last couple of weeks and it did not appear
> > to
> > > be doing this (i.e. re-running the entire select
> > > statement again). Is this what I should expect
> > when
> > > using cursors? I would have thought that the
> > server
> > > should "remember" where it left off in the query
> > since
> > > the last fetch and continue from there.
> > >
> > > Could I have inadvertently changed a parameter
> > > somewhere that would cause this behavior?
> > >
> > > thanks,
> > >
> > > Bill
> > >
> > > __________________________________________________
> > > Do You Yahoo!?
> > > Tired of spam? Yahoo! Mail has the best spam
> > protection around
> > > http://mail.yahoo.com
> > >
> > > ---------------------------(end of
> > broadcast)---------------------------
> > > TIP 3: if posting/reading through Usenet, please
> > send an appropriate
> > > subscribe-nomail command to
> > majordomo(at)postgresql(dot)org so that your
> > > message can get through to the mailing list
> > cleanly
> > >
> > >
> > >
> > >
> > >
> > >
> > --
> > Dave Cramer
> > 519 939 0336
> > ICQ # 14675561
> >
> >
>
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam? Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
>
>
> !DSPAM:40ef083f256273772718645!
>
>
--
Dave Cramer
519 939 0336
ICQ # 14675561


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: pg(at)fastcrypt(dot)com
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Cursors performance (was: Re: [PERFORM] Terrible performance after deleting/recreating indexes)
Date: 2004-07-09 21:18:01
Message-ID: 20040709211801.12966.qmail@web51406.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Here are the result from "log_duration = true"

DECLARE/1st FETCH: 325014.881 ms
2nd FETCH: 324055.281 ms

--- Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
> Ok, given that there are 5000 rows, the jdbc driver
> will actually fetch
> all 5000 when you do the fetch, so is it the speed
> of the connection, or
> the actual fetch that is taking the time, again,
> check the server logs
> for duration.
>
> Dave
> On Fri, 2004-07-09 at 17:03, Bill Chandler wrote:
> > Using psql it peforms exactly as I'd expect. The
> > rows get printed out to stdout, I hold down the
> space
> > bar to keep everything scrolling and as every 5000
> > rows go by I see a new fetch statement logged in
> the
> > server log. The results from this statement seem
> to
> > come back instaneously and the output starts
> scrolling
> > again immediately. Whole query takes a few
> minutes
> > to complete.
> >
> > I seems like it has something to do w/ my JDBC app
> > but I can't think for the life of me what I might
> have
> > changed. Anyway, there's only the
> setFetchSize(5000)
> > and the setAutoCommit(false) that are relevant to
> > cursors, right? And those have been in there for
> > weeks.
> >
> > Bill
> >
> > --- Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
> > > Bill,
> > >
> > > What happens if you do this in psql, also you
> can
> > > turn on duration
> > > logging in the backend and log the queries.
> > >
> > > dave
> > > On Fri, 2004-07-09 at 16:24, Bill Chandler
> wrote:
> > > > Thanks to all who have responded. I now think
> my
> > > > problem is not related to deleting/recreating
> > > indexes.
> > > > Somehow it is related to JDBC cursors. It
> appears
> > > > that what is happening is that since I'm using
>
> > > > a fetch size of 5000, the command:
> > > >
> > > > FETCH FORWARD 5000 FROM JDBC_CURS_1
> > > >
> > > > is being repeatedly sent to the server as I
> > > process
> > > > the result set from my query. Each time this
> > > command
> > > > is sent it it takes about 5 minutes to return
> > > which is
> > > > about the amount of time the whole query took
> to
> > > > complete before the performance degredation.
> So in
> > > > other words it looks as if the full select is
> > > being
> > > > rerun on each fetch.
> > > >
> > > > Now the mystery is why is this happening all
> of
> > > the
> > > > sudden? I have been running w/ fetch size set
> to
> > > 5000
> > > > for the last couple of weeks and it did not
> appear
> > > to
> > > > be doing this (i.e. re-running the entire
> select
> > > > statement again). Is this what I should
> expect
> > > when
> > > > using cursors? I would have thought that the
> > > server
> > > > should "remember" where it left off in the
> query
> > > since
> > > > the last fetch and continue from there.
> > > >
> > > > Could I have inadvertently changed a parameter
> > > > somewhere that would cause this behavior?
> > > >
> > > > thanks,
> > > >
> > > > Bill
> > > >
> > > >
> __________________________________________________
> > > > Do You Yahoo!?
> > > > Tired of spam? Yahoo! Mail has the best spam
> > > protection around
> > > > http://mail.yahoo.com
> > > >
> > > > ---------------------------(end of
> > > broadcast)---------------------------
> > > > TIP 3: if posting/reading through Usenet,
> please
> > > send an appropriate
> > > > subscribe-nomail command to
> > > majordomo(at)postgresql(dot)org so that your
> > > > message can get through to the mailing
> list
> > > cleanly
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > --
> > > Dave Cramer
> > > 519 939 0336
> > > ICQ # 14675561
> > >
> > >
> >
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam? Yahoo! Mail has the best spam
> protection around
> > http://mail.yahoo.com
> >
> >
> >
> > !DSPAM:40ef083f256273772718645!
> >
> >
> --
> Dave Cramer
> 519 939 0336
> ICQ # 14675561
>
>


__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Cursors performance
Date: 2004-07-09 21:55:35
Message-ID: 40EF1457.1060002@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Bill Chandler wrote:

> Now the mystery is why is this happening all of the
> sudden? I have been running w/ fetch size set to 5000
> for the last couple of weeks and it did not appear to
> be doing this (i.e. re-running the entire select
> statement again). Is this what I should expect when
> using cursors? I would have thought that the server
> should "remember" where it left off in the query since
> the last fetch and continue from there.

I'd check heap size, GC activity (-verbose:gc), CPU use, swapping
activity on the *client* side. It may be that your dataset size or
physical memory or something similar has changed sufficiently that GC
resulting from the data in each 5k row batch is killing you.

Can you try a trivial app that runs the same query (with same fetchsize,
autocommit, etc) via JDBC and does nothing but steps forward through the
resultset, and see how fast it runs? Perhaps the problem is in your
processing logic.

-O


From: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Cursors performance
Date: 2004-07-10 03:06:20
Message-ID: 40EF5D2C.5080506@coretech.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Might be worth doing a little test:

i) modify your code to fetch 1 row at a time
ii) set log_duration=true in your postgresql.conf (as the other posters
have suggested)

Then compare with running the query in psql.

regards

Mark

Bill Chandler wrote:

>Thanks to all who have responded. I now think my
>problem is not related to deleting/recreating indexes.
>Somehow it is related to JDBC cursors. It appears
>that what is happening is that since I'm using
>a fetch size of 5000, the command:
>
>FETCH FORWARD 5000 FROM JDBC_CURS_1
>
>is being repeatedly sent to the server as I process
>the result set from my query. Each time this command
>is sent it it takes about 5 minutes to return which is
>about the amount of time the whole query took to
>complete before the performance degredation. So in
>other words it looks as if the full select is being
>rerun on each fetch.
>
>Now the mystery is why is this happening all of the
>sudden? I have been running w/ fetch size set to 5000
>for the last couple of weeks and it did not appear to
>be doing this (i.e. re-running the entire select
>statement again). Is this what I should expect when
>using cursors? I would have thought that the server
>should "remember" where it left off in the query since
>the last fetch and continue from there.
>
>Could I have inadvertently changed a parameter
>somewhere that would cause this behavior?
>
>thanks,
>
>Bill
>
>__________________________________________________
>Do You Yahoo!?
>Tired of spam? Yahoo! Mail has the best spam protection around
>http://mail.yahoo.com
>
>---------------------------(end of broadcast)---------------------------
>TIP 8: explain analyze is your friend
>
>


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
Cc: pgsql-jdbc(at)postgresql(dot)org, pgsql-perform(at)postgresql(dot)org
Subject: Re: [PERFORM] Cursors performance
Date: 2004-07-12 18:07:29
Message-ID: 20040712180730.37055.qmail@web51407.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Thanks,

Will try this test (I'm assuming you mean to say to
set fetch size of 1 and rerun on both JDBC and
psql).

BTW, here is another clue: I only get the JDBC
performance degradation when I include the "real_name
like 'NEPOOL%REAL%'" clause. I've tried re-ordering
too: i.e. putting this clause first in the statement,
last in the statement, etc. Doesn't seem to make any
difference.

real_name is a varchar(64). There is a unique index
on it.

thanks,

Bill

--- Mark Kirkwood <markir(at)coretech(dot)co(dot)nz> wrote:
> Might be worth doing a little test:
>
> i) modify your code to fetch 1 row at a time
> ii) set log_duration=true in your postgresql.conf
> (as the other posters
> have suggested)
>
> Then compare with running the query in psql.
>
> regards
>
> Mark
>
>
>
> Bill Chandler wrote:
>
> >Thanks to all who have responded. I now think my
> >problem is not related to deleting/recreating
> indexes.
> >Somehow it is related to JDBC cursors. It appears
> >that what is happening is that since I'm using
> >a fetch size of 5000, the command:
> >
> >FETCH FORWARD 5000 FROM JDBC_CURS_1
> >
> >is being repeatedly sent to the server as I process
> >the result set from my query. Each time this
> command
> >is sent it it takes about 5 minutes to return which
> is
> >about the amount of time the whole query took to
> >complete before the performance degredation. So in
> >other words it looks as if the full select is being
> >rerun on each fetch.
> >
> >Now the mystery is why is this happening all of the
> >sudden? I have been running w/ fetch size set to
> 5000
> >for the last couple of weeks and it did not appear
> to
> >be doing this (i.e. re-running the entire select
> >statement again). Is this what I should expect
> when
> >using cursors? I would have thought that the
> server
> >should "remember" where it left off in the query
> since
> >the last fetch and continue from there.
> >
> >Could I have inadvertently changed a parameter
> >somewhere that would cause this behavior?
> >
> >thanks,
> >
> >Bill
> >
> >__________________________________________________
> >Do You Yahoo!?
> >Tired of spam? Yahoo! Mail has the best spam
> protection around
> >http://mail.yahoo.com
> >
> >---------------------------(end of
> broadcast)---------------------------
> >TIP 8: explain analyze is your friend
> >
> >
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>


__________________________________
Do you Yahoo!?
Yahoo! Mail - 50x more storage than other providers!
http://promotions.yahoo.com/new_mail


From: Barry Lind <blind(at)xythos(dot)com>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-jdbc(at)postgresql(dot)org, pgsql-perform(at)postgresql(dot)org
Subject: Re: [PERFORM] Cursors performance
Date: 2004-07-12 21:05:12
Message-ID: 40F2FD08.7020106@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Bill,

I suspect that this is an artifact of using server side prepared
statements. When testing this via psql you will be forming sql like:

select ...
from ...
where ...
and real_name like 'NEPOOL%REAL%'
...

but the JDBC driver with server side prepared statements is doing:

select ...
from ...
where ...
and real_name like ?
...

So when the statement is prepared, since it doesn't know what values you
are going to use in the bind variable, it will generally take a more
concervative execution plan than if it knows what the bind variable is.

So I suspect the performance difference is just in the different
execution plans for the two different forms of the sql statement.

thanks,
--Barry

Bill Chandler wrote:
> Thanks,
>
> Will try this test (I'm assuming you mean to say to
> set fetch size of 1 and rerun on both JDBC and
> psql).
>
> BTW, here is another clue: I only get the JDBC
> performance degradation when I include the "real_name
> like 'NEPOOL%REAL%'" clause. I've tried re-ordering
> too: i.e. putting this clause first in the statement,
> last in the statement, etc. Doesn't seem to make any
> difference.
>
> real_name is a varchar(64). There is a unique index
> on it.
>
> thanks,
>
> Bill
>
> --- Mark Kirkwood <markir(at)coretech(dot)co(dot)nz> wrote:
>
>>Might be worth doing a little test:
>>
>>i) modify your code to fetch 1 row at a time
>>ii) set log_duration=true in your postgresql.conf
>>(as the other posters
>>have suggested)
>>
>>Then compare with running the query in psql.
>>
>>regards
>>
>>Mark
>>
>>
>>
>>Bill Chandler wrote:
>>
>>
>>>Thanks to all who have responded. I now think my
>>>problem is not related to deleting/recreating
>>
>>indexes.
>>
>>>Somehow it is related to JDBC cursors. It appears
>>>that what is happening is that since I'm using
>>>a fetch size of 5000, the command:
>>>
>>>FETCH FORWARD 5000 FROM JDBC_CURS_1
>>>
>>>is being repeatedly sent to the server as I process
>>>the result set from my query. Each time this
>>
>>command
>>
>>>is sent it it takes about 5 minutes to return which
>>
>>is
>>
>>>about the amount of time the whole query took to
>>>complete before the performance degredation. So in
>>>other words it looks as if the full select is being
>>>rerun on each fetch.
>>>
>>>Now the mystery is why is this happening all of the
>>>sudden? I have been running w/ fetch size set to
>>
>>5000
>>
>>>for the last couple of weeks and it did not appear
>>
>>to
>>
>>>be doing this (i.e. re-running the entire select
>>>statement again). Is this what I should expect
>>
>>when
>>
>>>using cursors? I would have thought that the
>>
>>server
>>
>>>should "remember" where it left off in the query
>>
>>since
>>
>>>the last fetch and continue from there.
>>>
>>>Could I have inadvertently changed a parameter
>>>somewhere that would cause this behavior?
>>>
>>>thanks,
>>>
>>>Bill
>>>
>>>__________________________________________________
>>>Do You Yahoo!?
>>>Tired of spam? Yahoo! Mail has the best spam
>>
>>protection around
>>
>>>http://mail.yahoo.com
>>>
>>>---------------------------(end of
>>
>>broadcast)---------------------------
>>
>>>TIP 8: explain analyze is your friend
>>>
>>>
>>
>>---------------------------(end of
>>broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>
>>http://www.postgresql.org/docs/faqs/FAQ.html
>>
>
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Mail - 50x more storage than other providers!
> http://promotions.yahoo.com/new_mail
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org


From: Kris Jurka <books(at)ejurka(dot)com>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: Bill Chandler <billybobc1210(at)yahoo(dot)com>, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-jdbc(at)postgresql(dot)org, pgsql-perform(at)postgresql(dot)org
Subject: Re: [PERFORM] Cursors performance
Date: 2004-07-12 21:11:53
Message-ID: Pine.BSO.4.56.0407121610180.24837@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

On Mon, 12 Jul 2004, Barry Lind wrote:

> Bill,
>
> I suspect that this is an artifact of using server side prepared
> statements. When testing this via psql you will be forming sql like:

I don't think so. The 7.4 driver can use either cursors or server
prepared statements, not both. He's definitely using cursors, so I server
prepared statements don't come into the mix here.

Kris Jurka


From: Bill Chandler <billybobc1210(at)yahoo(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org, pgsql-perform(at)postgresql(dot)org
Subject: Re: [PERFORM] Cursors performance
Date: 2004-07-13 16:13:46
Message-ID: 20040713161346.62000.qmail@web51404.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

All,

Looks like I may have beaten this one to death. May
have to chalk it up to a limitation for now due to
deadlines and revisit it later.

One final clue before I go: if I change my wildcard to
'NEPOOL%' from 'NEPOOL%REAL%' my query completes much
faster. Of course this makes sense since it's much
easier to search a string for a prefix than it is to
do a complex regular expression match. I just didn't
expect it to be orders of magnitude difference.

The table containing the string being searched is only
7500 rows but I am joining it with a table with
2.5 million rows. So maybe there's something I can do
to do the wildcard search on the smaller table first
then do the join.

Ok, thanks again to all who responded. Really
appreciate the tips on logging statements and
duration, etc.

regards,

Bill
--- Kris Jurka <books(at)ejurka(dot)com> wrote:
>
>
> On Mon, 12 Jul 2004, Barry Lind wrote:
>
> > Bill,
> >
> > I suspect that this is an artifact of using server
> side prepared
> > statements. When testing this via psql you will
> be forming sql like:
>
> I don't think so. The 7.4 driver can use either
> cursors or server
> prepared statements, not both. He's definitely
> using cursors, so I server
> prepared statements don't come into the mix here.
>
> Kris Jurka
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo(at)postgresql(dot)org
>


__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bill Chandler <billybobc1210(at)yahoo(dot)com>
Cc: Kris Jurka <books(at)ejurka(dot)com>, pgsql-jdbc(at)postgresql(dot)org, pgsql-perform(at)postgresql(dot)org
Subject: Re: [PERFORM] Cursors performance
Date: 2004-07-13 17:08:57
Message-ID: 16994.1089738537@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc pgsql-performance

Bill Chandler <billybobc1210(at)yahoo(dot)com> writes:
> One final clue before I go: if I change my wildcard to
> 'NEPOOL%' from 'NEPOOL%REAL%' my query completes much
> faster.

Could we see the exact queries and EXPLAIN ANALYZE output for both
cases? I'm wondering if the plan changes. I think that the planner
will believe that the latter pattern is significantly more selective
(how much more selective depends on exactly which PG version you're
using); if this results in a bad row-count estimate then a bad plan
could get picked.

regards, tom lane