bit datatype and getObject()

Lists: pgsql-jdbc
From: Thomas Kellerer <spam_eater(at)gmx(dot)net>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: bit datatype and getObject()
Date: 2010-11-23 21:55:57
Message-ID: ichd9g$e1g$1@dough.gmane.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Hi,

the JDBC driver seems to map columns defined as "bit" to Boolean regardless of the definition of the column. So even a bit(5) is returned as boolean value which I think is not correct.

Consider the following

CREATE TABLE bit_test (some_bits bit(5));
INSERT INTO bit_test VALUES ('01010');

When I run a select in psql it will return 01010 as the column's value.

When running a select from within Java and using getObject() on that column a java.lang.Boolean is returned.

This might be correct for a bit(1) but not for columns defined with a bigger width.

It is returned correctly when using getString() on that column though.
But for applications using the generic getObject() this returns a wrong value.

I'm not sure what the JDBC specs requires here, but this mapping somehow is confusing.

I tested this with PG 9.0 and the JDBC4 build 801 driver.

Regards
Thomas


From: dmp <danap(at)ttc-cmc(dot)net>
To: Thomas Kellerer <spam_eater(at)gmx(dot)net>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bit datatype and getObject()
Date: 2010-11-24 03:06:11
Message-ID: 4CEC8123.4030503@ttc-cmc.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Thomas Kellerer wrote:
> Hi,
>
> the JDBC driver seems to map columns defined as "bit" to Boolean
> regardless of the definition of the column. So even a bit(5) is returned
> as boolean value which I think is not correct.
>
> Consider the following
>
> CREATE TABLE bit_test (some_bits bit(5));
> INSERT INTO bit_test VALUES ('01010');
>
> When I run a select in psql it will return 01010 as the column's value.
>
> When running a select from within Java and using getObject() on that
> column a java.lang.Boolean is returned.
>
> This might be correct for a bit(1) but not for columns defined with a
> bigger width.
>
> It is returned correctly when using getString() on that column though.
> But for applications using the generic getObject() this returns a wrong
> value.
>
> I'm not sure what the JDBC specs requires here, but this mapping somehow
> is confusing.
>
> I tested this with PG 9.0 and the JDBC4 build 801 driver.
>
> Regards
> Thomas

It is true that the mapping is java.lang.Boolean for the bit(x), but
there is another type in PostgreSQL to define a multi-bit type. Have
you tried: bit varying(5) retrieval by getObject()? Because that type
does map to java.lang.Object. The PostgreSQL manual 8.10 though indicates
that both are: "Bit String Types". Seems appropriate that getString()
then is how to retrieve these types. Either way isn't this a server
issue on the mapping, not the JDBC? I'm not sure but does not the JDBC
just return the mapping from the server?

Attached current data types mapping for majority of data types of server
9.0.1 and JDBC 9.0-801.

danap.

Attachment Content-Type Size
posgtresql_9.01_datatype_mappings.txt text/plain 1.9 KB

From: Kris Jurka <books(at)ejurka(dot)com>
To: Thomas Kellerer <spam_eater(at)gmx(dot)net>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bit datatype and getObject()
Date: 2010-11-24 06:31:13
Message-ID: alpine.BSO.2.00.1011240122570.13088@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Tue, 23 Nov 2010, Thomas Kellerer wrote:

> the JDBC driver seems to map columns defined as "bit" to Boolean regardless
> of the definition of the column. So even a bit(5) is returned as boolean
> value which I think is not correct.
>
> Consider the following
>
> CREATE TABLE bit_test (some_bits bit(5));
> INSERT INTO bit_test VALUES ('01010');
>
> When I run a select in psql it will return 01010 as the column's value.
>
> When running a select from within Java and using getObject() on that column a
> java.lang.Boolean is returned.
>
> This might be correct for a bit(1) but not for columns defined with a bigger
> width.
>
> I'm not sure what the JDBC specs requires here, but this mapping somehow is
> confusing.

Yeah, losing data is not good. Previously the discussion on list was to
use java.util.BitSet, but it never resulted in a patch. I'm not sure if
we can retain backwards compatibility for single bit length bitstrings.
So we'd have to determine whether we can maintain compatibility and if
not, is it still worth it to use BitSet.

Kris Jurka


From: Thomas Kellerer <spam_eater(at)gmx(dot)net>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bit datatype and getObject()
Date: 2010-11-24 08:00:58
Message-ID: icigo3$fs0$1@dough.gmane.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

dmp, 24.11.2010 04:06:

> It is true that the mapping is java.lang.Boolean for the bit(x), but
> there is another type in PostgreSQL to define a multi-bit type. Have
> you tried: bit varying(5) retrieval by getObject()? Because that type
> does map to java.lang.Object.

Correct.

The only thing I noticed is that it returns Types.OTHER as the datatype, not Types.BIT
But as I don't know if Types.BIT is meant to include "multi-bit" types I don't know if
that is correct or not.

> Either way isn't this a server issue on the mapping, not the JDBC?

I don't think so. psql displays this correctly.

Regards
Thomas


From: Kris Jurka <books(at)ejurka(dot)com>
To: Thomas Kellerer <spam_eater(at)gmx(dot)net>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bit datatype and getObject()
Date: 2010-12-22 19:24:57
Message-ID: alpine.BSO.2.00.1012221358080.19025@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 24 Nov 2010, Kris Jurka wrote:

> On Tue, 23 Nov 2010, Thomas Kellerer wrote:
>
>> the JDBC driver seems to map columns defined as "bit" to Boolean regardless
>> of the definition of the column. So even a bit(5) is returned as boolean
>> value which I think is not correct.
>>
>
> Yeah, losing data is not good. Previously the discussion on list was to use
> java.util.BitSet, but it never resulted in a patch.
>

I've looked at this in some more detail and I don't think BitSet matches
up closely enough with bit(n)/varbit to make it work. BitSet does not
store the total bit string length, only the length until the highest set
bit. So if you have a BitSet object you can make a safe roundtrip to the
database and back, but you can't make a safe roundtrip from the database
to a BitSet and back. If you have a bit/varbit field that has a zero
in the most significant bit, it will not go back to the database the same.

This causes some problems on the database side, for example:

CREATE TEMP TABLE bittest (a bit(3));
INSERT INTO bittest VALUES ('010');

Once you pull this into a BitSet object you can't reconstruct the original
and you are left with an error like this for bit types:

UPDATE bittest SET a = '10'::varbit;
ERROR: bit string length 2 does not match type bit(3)

For varbit strings you can store the new value, but then later operations
may fail:

CREATE TEMP TABLE varbittest(a varbit(3), b varbit(3));
INSERT INTO varbittest VALUES ('010', '111');
SELECT a & b FROM varbittest;
UPDATE varbittest SET a = '10'::varbit;
SELECT a & b FROM varbittest;
ERROR: cannot AND bit strings of different sizes

So if the problem was that tons of people on the Java side had BitSets
they wanted to put into the database and retrieve, this might be
tolerable. That's not our problem though, we're really looking at the
database side of things and trying to represent that in Java, so we need
another approach. At the moment, returing a String seems easiest, do
other people have ideas?

Kris Jurka


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Kris Jurka" <books(at)ejurka(dot)com>, "Thomas Kellerer" <spam_eater(at)gmx(dot)net>
Cc: <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: bit datatype and getObject()
Date: 2010-12-22 19:35:31
Message-ID: 4D11FEA30200002500038998@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Kris Jurka <books(at)ejurka(dot)com> wrote:

> returing a String seems easiest, do other people have ideas?

What would the Java String look like for SQL B'101' ?

Would bool[] make any sense for an object type?

I'm not sure byte[] would be out of the question, but it would need
zero padding on one side or the other, and you would lose the exact
length. :-(

What do other drivers do? Is there any guidance in the JDBC
standard?

-Kevin


From: Kris Jurka <books(at)ejurka(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Thomas Kellerer <spam_eater(at)gmx(dot)net>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bit datatype and getObject()
Date: 2010-12-22 20:13:37
Message-ID: alpine.BSO.2.00.1012221442490.19025@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 22 Dec 2010, Kevin Grittner wrote:

> What would the Java String look like for SQL B'101' ?

"101"

> Would bool[] make any sense for an object type?

Conceivably.

> I'm not sure byte[] would be out of the question, but it would need
> zero padding on one side or the other, and you would lose the exact
> length. :-(

boolean[] seems better to me than byte[]. One of the appeals of a
distinct/unique type like BitSet is that you can then implement
PreparedStatement.setObject and have it implement the reverse transform to
pass the data to the server with type safety that you wouldn't get using a
String. At the moment we don't do anything intelligent with native array
types, but if we did we wouldn't know if boolean[] should turn into varbit
or bit(1)[].

> What do other drivers do? Is there any guidance in the JDBC
> standard?

The standard says nothing useful. Section 8.3.3 of this document implies
that as an application developer you should just pretend multi-bit strings
don't exist...

http://download.oracle.com/javase/1.5.0/docs/guide/jdbc/getstart/mapping.html

Mysql's documentation shows it using byte[].

http://dev.mysql.com/doc/refman/5.5/en/connector-j-reference-type-conversions.html

Oracle doesn't look like it supports it.

I think IBM uses byte[], I'm not familiar with DB2 data types to
understand what's the equivalent of bytea and what's varbi, but they both
seem to be mapped to byte[].

http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.apdv.java.doc/doc/rjvjdata.htm

Kris Jurka


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Kris Jurka" <books(at)ejurka(dot)com>
Cc: "Thomas Kellerer" <spam_eater(at)gmx(dot)net>,<pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: bit datatype and getObject()
Date: 2010-12-22 20:32:26
Message-ID: 4D120BFA02000025000389FA@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Kris Jurka <books(at)ejurka(dot)com> wrote:

> The standard says nothing useful. Section 8.3.3 of this document
> implies that as an application developer you should just pretend
> multi-bit strings don't exist...

BitSet is tempting, but the fuzzy definition of size seems to be a
killer. We clearly don't want to use length() instead of size().
Unless someone can make a convincing argument to the contrary, I'm
inclined to agree that we should just us a Java String of '0' and
'1' characters.

-Kevin