Re: JBoss w/int8 primary keys in postgres ...

Lists: pgsql-jdbc
From: James Robinson <jlrobins(at)socialserve(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-06 18:38:40
Message-ID: 55BA76BA-E099-11D7-B5A3-000A9566A412@socialserve.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Greetings all,

Having just read a thread on the lack of implicit type cooersion by
postgresql when planning a query to use an index scan as opposed to a
table scan (thread over on psql-performance list at
http://archives.postgresql.org/pgsql-performance/2003-09/msg00090.php),
it hit me that the DB backing our EJB application has int8 primary keys
throughout (cooresponding to java datatype Long), and that the vast
majority of queries that JBoss is issuing with respect to pk values are
getting worst-case treatment as sequential scans as opposed to index
scans just from the int4 / int8 type mismatch on the primary key index.
Sample query generated by JBossCMP code:

LOG: query: SELECT id FROM address WHERE (contact=30410)

Here's the table:
social=# \d address;
Table "public.address"
Column | Type | Modifiers
--------------+---------+-----------
id | bigint | not null
name | text |
streetnumber | integer | not null
street | text |
street2 | text |
city | text |
state | text |
zip | text |
contact | bigint |
Indexes: pk_address primary key btree (id),
address_contact btree (contact)

Here's how it gets analyzed, since the 31410 gets treated by int4
naturally:

explain analyze SELECT id FROM address WHERE (contact=30410);
QUERY PLAN
------------------------------------------------------------------------
---------------------------
Seq Scan on address (cost=0.00..166.51 rows=1 width=8) (actual
time=17.41..17.41 rows=0 loops=1)
Filter: (contact = 30410)
Total runtime: 17.50 msec
(3 rows)

Explicitly casting the literal number to int8, making it match the
index type lets us do an index scan:

# explain analyze SELECT id FROM address WHERE (contact=30410::int8);
QUERY PLAN
------------------------------------------------------------------------
-----------------------------------------------
Index Scan using address_contact on address (cost=0.00..4.20 rows=1
width=8) (actual time=0.04..0.04 rows=0 loops=1)
Index Cond: (contact = 30410::bigint)
Total runtime: 0.12 msec
(3 rows)

A great bit better, of course.

Is there any hope to get JBoss + PG JDBC to do this typecasting
automatically? Assuming JBoss is using PreparedStatements for its dirty
work, could the setLong() method on the JDBC driver's PreparedStatement
implementation possibly always punch in the trailing '::int8' ?

Or does someone already know that JBossCMP does not use
PreparedStatements, and I'm really barking up the wrong tree here.

Thanks,

James


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: James Robinson <jlrobins(at)socialserve(dot)com>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-06 19:56:53
Message-ID: 1062878212.1598.607.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

James,

Have you run vacuum analyze on the database ?

Dave
On Sat, 2003-09-06 at 14:38, James Robinson wrote:
> Greetings all,
>
> Having just read a thread on the lack of implicit type cooersion by
> postgresql when planning a query to use an index scan as opposed to a
> table scan (thread over on psql-performance list at
> http://archives.postgresql.org/pgsql-performance/2003-09/msg00090.php),
> it hit me that the DB backing our EJB application has int8 primary keys
> throughout (cooresponding to java datatype Long), and that the vast
> majority of queries that JBoss is issuing with respect to pk values are
> getting worst-case treatment as sequential scans as opposed to index
> scans just from the int4 / int8 type mismatch on the primary key index.
> Sample query generated by JBossCMP code:
>
> LOG: query: SELECT id FROM address WHERE (contact=30410)
>
> Here's the table:
> social=# \d address;
> Table "public.address"
> Column | Type | Modifiers
> --------------+---------+-----------
> id | bigint | not null
> name | text |
> streetnumber | integer | not null
> street | text |
> street2 | text |
> city | text |
> state | text |
> zip | text |
> contact | bigint |
> Indexes: pk_address primary key btree (id),
> address_contact btree (contact)
>
> Here's how it gets analyzed, since the 31410 gets treated by int4
> naturally:
>
> explain analyze SELECT id FROM address WHERE (contact=30410);
> QUERY PLAN
> ------------------------------------------------------------------------
> ---------------------------
> Seq Scan on address (cost=0.00..166.51 rows=1 width=8) (actual
> time=17.41..17.41 rows=0 loops=1)
> Filter: (contact = 30410)
> Total runtime: 17.50 msec
> (3 rows)
>
> Explicitly casting the literal number to int8, making it match the
> index type lets us do an index scan:
>
> # explain analyze SELECT id FROM address WHERE (contact=30410::int8);
> QUERY PLAN
> ------------------------------------------------------------------------
> -----------------------------------------------
> Index Scan using address_contact on address (cost=0.00..4.20 rows=1
> width=8) (actual time=0.04..0.04 rows=0 loops=1)
> Index Cond: (contact = 30410::bigint)
> Total runtime: 0.12 msec
> (3 rows)
>
> A great bit better, of course.
>
> Is there any hope to get JBoss + PG JDBC to do this typecasting
> automatically? Assuming JBoss is using PreparedStatements for its dirty
> work, could the setLong() method on the JDBC driver's PreparedStatement
> implementation possibly always punch in the trailing '::int8' ?
>
> Or does someone already know that JBossCMP does not use
> PreparedStatements, and I'm really barking up the wrong tree here.
>
> Thanks,
>
> James
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>
--
Dave Cramer <Dave(at)micro-automation(dot)net>


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: James Robinson <jlrobins(at)socialserve(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-06 22:25:30
Message-ID: 20030906222528.GD6486@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Sat, Sep 06, 2003 at 02:38:40PM -0400, James Robinson wrote:

> Is there any hope to get JBoss + PG JDBC to do this typecasting
> automatically? Assuming JBoss is using PreparedStatements for its dirty
> work, could the setLong() method on the JDBC driver's PreparedStatement
> implementation possibly always punch in the trailing '::int8' ?

Something like this was tried without much success. See:

http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00163.php
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.6
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.12

Reading that thread, the change that was tried was quoting the value to get
an implicit cast from 'text' to the expected type -- whatever it was --
happening.

Perhaps it's reasonable to have setLong() always append ::int8 since that
does reflect the possible range of the parameter's type -- anyone more
familiar with this area want to comment? I guess that you end up with the
reverse problem of things that expect an int4 value but use setLong() will
break, but maybe that is less common and is going to have other problems
anyway if you try to use a java long that's larger than int4 can handle ..

-O


From: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: James Robinson <jlrobins(at)socialserve(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-07 10:44:02
Message-ID: 20030907114402.A21495@bacon
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


On 06/09/2003 23:25 Oliver Jowett wrote:
> On Sat, Sep 06, 2003 at 02:38:40PM -0400, James Robinson wrote:
>
> > Is there any hope to get JBoss + PG JDBC to do this typecasting
> > automatically? Assuming JBoss is using PreparedStatements for its dirty
>
> > work, could the setLong() method on the JDBC driver's PreparedStatement
>
> > implementation possibly always punch in the trailing '::int8' ?
>
> Something like this was tried without much success. See:
>
> http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00163.php
>
> http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.6
>
> http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.12
>
> Reading that thread, the change that was tried was quoting the value to
> get
> an implicit cast from 'text' to the expected type -- whatever it was --
> happening.
>
> Perhaps it's reasonable to have setLong() always append ::int8 since that
> does reflect the possible range of the parameter's type -- anyone more
> familiar with this area want to comment? I guess that you end up with the
> reverse problem of things that expect an int4 value but use setLong()
> will
> break, but maybe that is less common and is going to have other problems
> anyway if you try to use a java long that's larger than int4 can handle

Looking at the sources for the 7.3.4 driver, the code is almost there. It
looks like the type is carried across to org.postgresql.core.QueryExecutor
but is never send to the BE. I'm not familiar with the FE-BE protocol but
it just looks like we need to pass the m_bindTypes[] array across and
insert 2 lines of code to send :: followed by send m_bindTypes[i]. To
prevent breaking existing code, we would need make this the non-default
behavior. It would need to settable in the url string to be used in an EJB
container.

Comments?

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+


From: James Robinson <jlrobins(at)socialserve(dot)com>
To: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-07 19:09:23
Message-ID: CA913180-E166-11D7-9C55-000A9566A412@socialserve.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


On Sunday, September 7, 2003, at 06:44 AM, Paul Thomas wrote:

> It would need to settable in the url string to be used in an EJB
> container.
>
> Comments?

Well, reading the associated threads more closely, I see that Tom
announced that there was a proposed fix in the backend for 7.4 (see
http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00164.php). I'm
cooking 7.4B2 right now to see if anything is indeed different. But,
given the most recent discussion in performance-land, dated this past
Friday, I'm not expecting too much.

A backend fix would be the clearest answer, since the principle of
least surprise is being violated by the backend, not the JDBC driver.
I'm just not sure if that is/will be done in a timely enough fashon,
given I have this problem *now*, and our particular frontend-of-choice
can indeed inject a lossless cast in the setLong() path (and, likewise,
perhaps a '::int2' for the setShort() path as well).

CMP will be the death of us all.

James


From: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
To: James Robinson <jlrobins(at)socialserve(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-08 14:18:55
Message-ID: 20030908151855.C32666@bacon
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On 07/09/2003 20:09 James Robinson wrote:
>
> On Sunday, September 7, 2003, at 06:44 AM, Paul Thomas wrote:
>
>> It would need to settable in the url string to be used in an EJB
>> container.
>>
>> Comments?
>
>
> Well, reading the associated threads more closely, I see that Tom
> announced that there was a proposed fix in the backend for 7.4 (see
> http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00164.php). I'm
> cooking 7.4B2 right now to see if anything is indeed different. But,
> given the most recent discussion in performance-land, dated this past
> Friday, I'm not expecting too much.

Looking at the cvs HEAD sources, there's nothing there that would help you.

> A backend fix would be the clearest answer, since the principle of least
> surprise is being violated by the backend, not the JDBC driver. I'm just
> not sure if that is/will be done in a timely enough fashon, given I have
> this problem *now*, and our particular frontend-of-choice can indeed
> inject a lossless cast in the setLong() path (and, likewise, perhaps a
> '::int2' for the setShort() path as well).
>

Whether you consider PGs very tight typing a bug or not is a matter of
opinion. It probably would help if the BE could do type promotion
(int2->int4->int8). But that would only help those using that particular
version of PG. As the JDBC driver is designed to work for current + many
past releases of PG, modifications along the line I've proposed would also
help users of older versions of PG and provide fairly consistent behavior
across versions.

> CMP will be the death of us all.

No, just those who choose to go the EJB route. My preference is DAO.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+


From: James Robinson <jlrobins(at)socialserve(dot)com>
To: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-09 00:57:04
Message-ID: 8774FF9C-E260-11D7-AC62-000A9566A412@socialserve.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


On Monday, September 8, 2003, at 10:18 AM, Paul Thomas wrote:

> As the JDBC driver is designed to work for current + many past
> releases of PG, modifications along the line I've proposed would also
> help users of older versions of PG and provide fairly consistent
> behavior across versions.

I'm certainly not against such a proposal, making our
PreparedStatements smarter, but it would not help out in the JBossCMP
case, nor would it help out any other frontend, including the CLI psql
client, in that:

create table test (
pk int8 not null primary key,
name text;
);
copy into test values ( .... )

...
select name from test where id = 42;

just doesn't do what you would reasonably expect.

>
>> CMP will be the death of us all.
>
> No, just those who choose to go the EJB route. My preference is DAO.

I am enslaved to entity beans for now, and am ignorant for the most
part regarding the philosophies of both JDO and DAO. How would DAO get
around this issue?

Thanks,
James


From: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
To: James Robinson <jlrobins(at)socialserve(dot)com>
Cc: "pgsql-jdbc (at) postgresql (dot) org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-09 13:32:08
Message-ID: 20030909143208.C3136@bacon
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On 09/09/2003 01:57 James Robinson wrote:
>
> On Monday, September 8, 2003, at 10:18 AM, Paul Thomas wrote:
>
>> As the JDBC driver is designed to work for current + many past releases
>> of PG, modifications along the line I've proposed would also help users
>> of older versions of PG and provide fairly consistent behavior across
>> versions.
>
> I'm certainly not against such a proposal, making our PreparedStatements
> smarter, but it would not help out in the JBossCMP case, nor would it
> help out any other frontend, including the CLI psql client, in that:
>
> create table test (
> pk int8 not null primary key,
> name text;
> );
> copy into test values ( .... )
>
> ...
> select name from test where id = 42;
>
> just doesn't do what you would reasonably expect.

With the proposed modifications to the JDBC driver enabled

PreparedStatement pstmt = con.prepareStatement("select name from test
where id = ?");
pstmt.setLong(1, 42);
pstmt.executeQuery();

would actually produce:

select name from test where id = 42::int8

which _would_ cause an index to be used on your table.
>>> CMP will be the death of us all.
>>
>> No, just those who choose to go the EJB route. My preference is DAO.
>
> I am enslaved to entity beans for now, and am ignorant for the most part
> regarding the philosophies of both JDO and DAO. How would DAO get around
> this issue?

Take a look at the sample Adventure application on java.sun.com. It uses
DAOs.

HTH

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
Cc: James Robinson <jlrobins(at)socialserve(dot)com>, "pgsql-jdbc (at) postgresql (dot) org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-09 13:38:41
Message-ID: 1063114720.67981.9.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

> With the proposed modifications to the JDBC driver enabled
>
> PreparedStatement pstmt = con.prepareStatement("select name from test
> where id = ?");
> pstmt.setLong(1, 42);
> pstmt.executeQuery();
>
> would actually produce:
>
> select name from test where id = 42::int8

Silly question, but why don't you simply ensure any ?'s are quoted?

select name from test where id = '42'

This allows PostgreSQL to choose the best type for the job -- in this
case it will try to match that of id. It will work that way on all
older and newer versions of PostgreSQL.


From: Barry Lind <blind(at)xythos(dot)com>
To: Rod Taylor <rbt(at)rbt(dot)ca>
Cc: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>, James Robinson <jlrobins(at)socialserve(dot)com>, "pgsql-jdbc (at) postgresql (dot) org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-09 17:39:24
Message-ID: 3F5E104C.30604@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Rod,

That was tried once and needed to be pulled out. That change ended up
breaking some code since there are some circumstances where '5' is not
the same as 5 in the parser. I don't remember the details but they
should be in the maillist archives (or the cvs commit message for the
revert on the change).

thanks,
--Barry

PS. Because of having tried and failed to fix this in the driver
before, I am gun shy on trying a fix again at the driver level. The
server is the proper place for this to be fixed. And I think it is in
7.4, but haven't verified that myself.

Rod Taylor wrote:
>>With the proposed modifications to the JDBC driver enabled
>>
>>PreparedStatement pstmt = con.prepareStatement("select name from test
>>where id = ?");
>>pstmt.setLong(1, 42);
>>pstmt.executeQuery();
>>
>>would actually produce:
>>
>>select name from test where id = 42::int8
>
>
> Silly question, but why don't you simply ensure any ?'s are quoted?
>
> select name from test where id = '42'
>
> This allows PostgreSQL to choose the best type for the job -- in this
> case it will try to match that of id. It will work that way on all
> older and newer versions of PostgreSQL.


From: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
To: "pgsql-jdbc (at) postgresql (dot) org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-10 13:31:53
Message-ID: 20030910143153.I3136@bacon
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


On 09/09/2003 18:39 Barry Lind wrote:
> Rod,
>
> That was tried once and needed to be pulled out. That change ended up
> breaking some code since there are some circumstances where '5' is not
> the same as 5 in the parser. I don't remember the details but they
> should be in the maillist archives (or the cvs commit message for the
> revert on the change).
>
> thanks,
> --Barry
>
> PS. Because of having tried and failed to fix this in the driver
> before, I am gun shy on trying a fix again at the driver level. The
> server is the proper place for this to be fixed. And I think it is in
> 7.4, but haven't verified that myself.

Is there any way we could find out for definite? Until this index/type
cast issue is sorted out I don't see any way that PostgreSQL can even be
considered ready for enterprise use. EJB/CMP is an important enterprise
technology and int8 index columns are not unknown in big company databases.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants |
http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------+


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JBoss w/int8 primary keys in postgres ...
Date: 2003-09-10 16:30:05
Message-ID: 1063211405.1637.33.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

There are perfectly valid reasons why postgres uses a sequence scan as
opposed to an index scan.

http://www.postgresql.org/docs/faqs/FAQ.html

That being said, the reason I asked if you had vacuumed the database
lately is that that is the only way to update the planner statistics.

So please vacuum analyze the entire db, if it then doesn't use an index
scan you can try disabling sequence scans in the configuration file.

Dave

On Wed, 2003-09-10 at 09:31, Paul Thomas wrote:
> On 09/09/2003 18:39 Barry Lind wrote:
> > Rod,
> >
> > That was tried once and needed to be pulled out. That change ended up
> > breaking some code since there are some circumstances where '5' is not
> > the same as 5 in the parser. I don't remember the details but they
> > should be in the maillist archives (or the cvs commit message for the
> > revert on the change).
> >
> > thanks,
> > --Barry
> >
> > PS. Because of having tried and failed to fix this in the driver
> > before, I am gun shy on trying a fix again at the driver level. The
> > server is the proper place for this to be fixed. And I think it is in
> > 7.4, but haven't verified that myself.
>
> Is there any way we could find out for definite? Until this index/type
> cast issue is sorted out I don't see any way that PostgreSQL can even be
> considered ready for enterprise use. EJB/CMP is an important enterprise
> technology and int8 index columns are not unknown in big company databases.
--
Dave Cramer <Dave(at)micro-automation(dot)net>