Re: Charset problem on WHERE clause

Lists: pgsql-jdbc
From: smota <samuelmota(at)gmail(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Charset problem on WHERE clause
Date: 2004-07-26 14:20:40
Message-ID: a8bb739d04072607204610c2ce@mail.gmail.com

Hi,

I'm pretty new to PostgreSQL as well as to its JDBC driver.

For now I'm using PostgreSQL version 7.3.6 under Red Hat ES 3.0.
The database was created with SQL_ASCII encoding.
I'm retrieving data from the database with the pg74.214.jdbc3.jar driver.

Some fields contain values with accents (characters like Ç, Ã, Õ, etc.) ...

I've set the connection string with
jdbc:postgresql://10.100.1.11:5432/mydatabase?charSet=LATIN1

In the Java code I have to read the fields with new
String(result.getBytes(1),"ISO-8859-1") to have accented characters
displayed correctly.
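
For reference, a minimal sketch of that read-side workaround. There is no live database here, so a hard-coded byte array stands in for result.getBytes(1); the class name and the sampleColumnBytes() helper are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Decode {
    // Same idea as new String(result.getBytes(1), "ISO-8859-1"):
    // interpret each raw byte as one ISO-8859-1 (LATIN1) character.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical stand-in for result.getBytes(1): the word MANUTENÇÃO
    // as a LATIN1 client would have stored it (Ç = 0xC7, Ã = 0xC3).
    static byte[] sampleColumnBytes() {
        return new byte[] {'M', 'A', 'N', 'U', 'T', 'E', 'N',
                           (byte) 0xC7, (byte) 0xC3, 'O'};
    }

    public static void main(String[] args) {
        String value = decodeLatin1(sampleColumnBytes());
        // U+00C7 is Ç and U+00C3 is Ã; the escapes below avoid any
        // dependence on the encoding of this source file.
        System.out.println(value.equals("MANUTEN\u00C7\u00C3O"));
    }
}
```

This only covers values coming out of the database; it does nothing for text sent to the server inside a query.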

My problem is that when I use an accented character in a WHERE expression,
the query doesn't return any rows.
I've tried field IN (to_char('MANUTENÇÃO', 'LATIN1')) ... but with no success.

Any idea or help on this?

By the way, both pgAdmin III and the psql command-line tool work
with accents in the WHERE clause.

Thanks


From: Kris Jurka <books(at)ejurka(dot)com>
To: smota <samuelmota(at)gmail(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Charset problem on WHERE clause
Date: 2004-07-26 17:09:09
Message-ID: Pine.BSO.4.56.0407261205440.7379@leary.csoft.net

On Mon, 26 Jul 2004, smota wrote:

> Hi,
>
> I'm pretty new to PostgreSQL as well as to its JDBC driver.
>
> For now I'm using PostgreSQL version 7.3.6 under Red Hat ES 3.0.
> The database was created with SQL_ASCII encoding.
> I'm retrieving data from the database with the pg74.214.jdbc3.jar driver.
>
> jdbc:postgresql://10.100.1.11:5432/mydatabase?charSet=LATIN1

You should not use a SQL_ASCII database. The JDBC driver requires your
database to use a proper encoding for your data. The ?charSet URL
parameter was designed to work around this problem for <= 7.2 servers,
which didn't come with multibyte encoding support compiled in by default, but
it is ignored for >= 7.3 servers, so it is useless here.
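
To illustrate why the WHERE comparison can fail in this setup, here is a small sketch. It assumes the driver sends query text to a 7.3+ server as UTF-8 (client_encoding UNICODE) and that a SQL_ASCII database compares the bytes as-is without translation; both points are assumptions about this configuration, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingMismatch {
    // Bytes as a LATIN1 client would have stored the value.
    static byte[] storedBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Bytes as sent in the query, assuming the driver encodes as UTF-8.
    static byte[] sentBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated hex pairs for inspection.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+00C7 is Ç and U+00C3 is Ã.
        String literal = "MANUTEN\u00C7\u00C3O";
        byte[] stored = storedBytes(literal);
        byte[] sent = sentBytes(literal);
        System.out.println("stored (LATIN1): " + hex(stored));
        System.out.println("sent   (UTF-8):  " + hex(sent));
        // Each accented character is one byte in LATIN1 but two in UTF-8,
        // so a byte-wise comparison of the two forms cannot match.
        System.out.println("same bytes: " + Arrays.equals(stored, sent));
    }
}
```

Under those assumptions, the stored value and the query literal never compare equal even though they look identical on screen.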

Kris Jurka


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Charset problem on WHERE clause
Date: 2004-07-26 21:09:25
Message-ID: 41057305.8070700@opencloud.com

Kris Jurka wrote:

> You should not use a SQL_ASCII database. The JDBC driver requires your
> database to use a proper encoding for your data. The ?charSet URL
> parameter was designed to work around this problem for <= 7.2 servers,
> which didn't come with multibyte encoding support compiled in by default, but
> it is ignored for >= 7.3 servers, so it is useless here.

I wonder if it's worth supporting the charSet parameter even for >= 7.3:
set client_encoding explicitly to SQL_ASCII (which I believe means "no
translation") and do the translation to Unicode on the JVM side using
whatever charset the user provided. I think most of the encoding details
are now isolated from the rest of the protocol logic, so it wouldn't be
a very invasive change.
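
A rough sketch of what such JVM-side translation might look like. This is not driver code: ClientSideCharset and its methods are hypothetical names, and encoding the outgoing query text with the same charset is an added assumption beyond the decoding step described above.

```java
import java.nio.charset.Charset;

public class ClientSideCharset {
    private final Charset charset;

    // charSetName plays the role of the ?charSet URL parameter.
    public ClientSideCharset(String charSetName) {
        this.charset = Charset.forName(charSetName);
    }

    // Outgoing (assumption): encode query text in the user's charset so the
    // bytes match what is stored when the server does no translation.
    public byte[] toServer(String text) {
        return text.getBytes(charset);
    }

    // Incoming: decode raw server bytes into a Java (Unicode) String.
    public String fromServer(byte[] raw) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        ClientSideCharset cs = new ClientSideCharset("ISO-8859-1");
        // U+00C7 is Ç and U+00C3 is Ã.
        String literal = "MANUTEN\u00C7\u00C3O";
        byte[] wire = cs.toServer(literal);
        System.out.println("bytes on the wire: " + wire.length);
        System.out.println("round trip intact: " + cs.fromServer(wire).equals(literal));
    }
}
```

The round trip is only lossless when every character in the text exists in the chosen charset; anything outside it would be replaced on encoding.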

My only concern is that it would encourage people to keep their DBs as
SQL_ASCII, which just delays the problem.

-O


From: smota <samuelmota(at)gmail(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Charset problem on WHERE clause
Date: 2004-07-26 21:30:33
Message-ID: a8bb739d040726143019540e1b@mail.gmail.com

Hi,

Thanks Kris for your fast response .... I've recreated the database
using LATIN1 and everything is working just fine.

> I wonder if it's worth supporting the charSet parameter even for >= 7.3:
> set client_encoding explicitly to SQL_ASCII (which I believe means "no
> translation") and do the translation to Unicode on the JVM side using
> whatever charset the user provided. I think most of the encoding details
> are now isolated from the rest of the protocol logic, so it wouldn't be
> a very invasive change.

I really don't see this as a good idea :O ....

With the LATIN1 database I no longer need the Java-side conversion (new
String(rs.getBytes(1)....etc. etc.), so my code is cleaner and simpler.
Being pushed to use a correctly encoded database was a good thing.

:)

Thanks