Re: Encoding from CopyManager.copyIn()

Lists: pgsql-jdbc
From: Markus Kickmaier <markus(dot)kickmaier(at)apus(dot)co(dot)at>
To: Daniel Migowski <dmigowski(at)ikoffice(dot)de>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Encoding from CopyManager.copyIn()
Date: 2009-07-27 08:23:47
Message-ID: 5637857.99741248683027131.JavaMail.root@donald.apus.co.at

Thanks for the responses, Daniel and Kris,

but I just can't get it to work. I now know exactly what my problem is.
I have a SQL_ASCII-encoded database. The JDBC driver uses UNICODE as client_encoding, so if I try to copy an umlaut like ü into a table, I get the error: invalid byte sequence for UTF8...

If I test this in pgAdmin, the result is the same. But if I set client_encoding to 'SQL_ASCII' in pgAdmin, it works fine.
When I try the same on my JDBC connection, I get a PSQLException saying that the client_encoding parameter was changed to SQL_ASCII and that the JDBC driver only works correctly with UNICODE.

Any ideas? I'm fairly sure it would work if JDBC let me use SQL_ASCII.
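
The 0xfc in the error message can be reproduced without a database. The sketch below assumes the umlaut reached the server as a single ISO-8859-1 byte (plausible when a PrintWriter falls back to a Latin-1 style platform default charset, though the thread does not confirm the platform). The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class InvalidUtf8Demo {

    /** Returns true if the bytes are well-formed UTF-8, using a strict decoder. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Formats bytes as lowercase hex separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00fc"; // the letter ü, escaped so the source file encoding does not matter

        // Assumption: this is what a Latin-1 default charset would have written.
        byte[] latin1 = umlaut.getBytes(StandardCharsets.ISO_8859_1);
        // This is what a client_encoding of UNICODE (UTF-8) expects to receive.
        byte[] utf8 = umlaut.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 bytes: " + hex(latin1) + ", valid UTF-8: " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes:      " + hex(utf8) + ", valid UTF-8: " + isValidUtf8(utf8));
    }
}
```

The single byte fc is not a well-formed UTF-8 sequence, while the two bytes c3 bc are, which matches the server rejecting the former when it expects UTF-8 input.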

BR, Markus

----- "Daniel Migowski" <dmigowski(at)ikoffice(dot)de> wrote:

> Or, in your case you have to wrap the OutputStream in an
> OutputStreamWriter (which has the encoding parameter in the
> constructor).
>
> best
> Daniel
>
> Markus Kickmaier wrote:
> > Hello,
> >
> > I'm using the copyIn() function of the CopyManager. It works fine as long as I don't use an umlaut like ü. Then I get a PSQLException:
> >
> > org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0xfc
> >
> > My code looks as follows:
> >
> > ByteArrayOutputStream output = new ByteArrayOutputStream();
> > PrintWriter writer = new PrintWriter(output);
> > writer.println("abcüäö");
> > writer.flush();
> > ByteArrayInputStream input = new ByteArrayInputStream(output.toByteArray());
> > long result = ((PGConnection) con_).getCopyAPI().copyIn(statement, input);
> >
> > After searching on Google I found out that this is an encoding problem. The database doesn't know what charset I'm using.
> >
> > Any suggestions on how I can specify the encoding I want to use?
> >
> > BR, Markus
> >
> >


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Markus Kickmaier <markus(dot)kickmaier(at)apus(dot)co(dot)at>
Cc: Daniel Migowski <dmigowski(at)ikoffice(dot)de>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Encoding from CopyManager.copyIn()
Date: 2009-07-28 05:40:41
Message-ID: 4A6E8F59.7080004@opencloud.com

Markus Kickmaier wrote:
> Thanks for the responses, Daniel and Kris,
>
> but I just can't get it to work. I now know exactly what my problem is.
> I have a SQL_ASCII-encoded database. The JDBC driver uses UNICODE as client_encoding, so if I try to copy an umlaut like ü into a table, I get the error: invalid byte sequence for UTF8...
>
> If I test this in pgAdmin, the result is the same. But if I set client_encoding to 'SQL_ASCII' in pgAdmin, it works fine.
> When I try the same on my JDBC connection, I get a PSQLException saying that the client_encoding parameter was changed to SQL_ASCII and that the JDBC driver only works correctly with UNICODE.
>
> Any ideas? I'm fairly sure it would work if JDBC let me use SQL_ASCII.

You should convert your database to an appropriate encoding for the data
it contains (perhaps LATIN1?). If the database encoding is SQL_ASCII,
the JDBC driver has no way of knowing how to convert bytes >127 to
Java's UTF-16 String representation.

Basically, SQL_ASCII is only going to work with the JDBC driver if you
only store 7-bit ASCII, or if you happen to be very lucky and have all
clients everywhere use a client_encoding of UNICODE.
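
To illustrate why bytes >127 are ambiguous without a declared encoding, here is a small sketch that decodes the same stored byte under several charsets. The charset list is an arbitrary choice for illustration, and the sketch assumes the running JVM ships ISO-8859-5 and ISO-8859-7 (only ISO-8859-1 and UTF-8 are guaranteed by the Java specification).

```java
import java.nio.charset.Charset;

final class AmbiguousByteDemo {

    /** Decodes the bytes with the named charset; malformed input becomes U+FFFD. */
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // One byte above 127, as it might sit in a SQL_ASCII database.
        byte[] stored = { (byte) 0xfc };

        // Hypothetical guesses a client could make about what the byte means.
        String[] guesses = { "ISO-8859-1", "ISO-8859-5", "ISO-8859-7", "UTF-8" };

        for (String name : guesses) {
            String text = decode(stored, name);
            // Print the code point rather than the glyph to avoid console encoding issues.
            System.out.printf("%-10s -> U+%04X%n", name, (int) text.charAt(0));
        }
    }
}
```

Under ISO-8859-1 the byte decodes to U+00FC (ü), under UTF-8 it is malformed and becomes the replacement character U+FFFD, and the other charsets map it to unrelated letters. With SQL_ASCII the server stores no hint about which reading is the intended one.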

-O


From: Markus Kickmaier <markus(dot)kickmaier(at)apus(dot)co(dot)at>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Daniel Migowski <dmigowski(at)ikoffice(dot)de>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Encoding from CopyManager.copyIn()
Date: 2009-07-28 12:51:58
Message-ID: 4582381.102821248785518175.JavaMail.root@donald.apus.co.at

Hi,

what I've done now is the following:

- I converted my database to UTF8.
- I use an OutputStreamWriter with UTF8 as the encoding to fill my stream for the copy statement.

Now it works. Thanks for your help.
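
A minimal sketch of what the second step might look like, based on the code posted earlier in the thread. The class name, helper method, and the table name in the trailing comment are hypothetical. The copyIn() call itself is only shown as a comment, so the example runs without a database or the PostgreSQL driver on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

final class Utf8CopyPayload {

    /** Encodes each row as one newline-terminated line of UTF-8 text. */
    static byte[] encodeRows(String... rows) {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        // The explicit charset replaces the platform default used by new PrintWriter(output).
        try (PrintWriter writer =
                new PrintWriter(new OutputStreamWriter(output, StandardCharsets.UTF_8))) {
            for (String row : rows) {
                writer.print(row);
                writer.print('\n'); // fixed line ending instead of the platform separator
            }
        } // closing the writer flushes it into the byte array
        return output.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = encodeRows("abc\u00fc\u00e4\u00f6"); // "abcüäö"
        InputStream input = new ByteArrayInputStream(payload);

        System.out.println("payload length in bytes: " + payload.length);

        // With an open connection (hypothetical variable con and table my_table),
        // the stream would then be handed to the driver as in the original post:
        // long rows = ((PGConnection) con).getCopyAPI()
        //         .copyIn("COPY my_table FROM STDIN", input);
    }
}
```

Each umlaut takes two bytes in UTF-8, so the six-character row plus its newline produces a ten-byte payload.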

BR, Markus
