Re: Charset Win1250 on Windows and Ubuntu

Lists: pgsql-general
From: Durumdara <durumdara(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-18 12:30:46
Message-ID: 9e384ef60912180430ie63d515g28ddc809a4dedbd7@mail.gmail.com

Hi!

I have software that uses PostgreSQL. This program (and website) was developed
and runs on Windows (XP/2003), with the native charset (win1250).

Last week we got a special request to install this software on a Linux
server.

Yesterday I installed Ubuntu 9.10 in VirtualBox and tried to move the
database to Linux.

The first big problem is that when I tried to create a database with the same
parameters as on Windows, pgAdmin showed an error.
The error message is:
"Error: new encoding (Win1250) is incompatible with the encoding of the
template database (UTF8)."

Ok, I changed to "template0".

Then I got an error that Win1250 is not compatible with the collation hu_HU.UTF8.

When I inserted Hungarian characters (to check the sort order), the C and
POSIX collations returned the wrong order, as I had expected.

The Windows version of PG and pgAdmin does not support collation, so these two
options (collation, character type) are disabled.

But on Linux only the UTF8 version can sort rows in the correct order.

The problem is that the client program is win1250-based, and I would have to
rewrite everything to get the same results.

Does anybody know a way, some tricky solution, for this problem?

Thanks for your help:
dd


From: Adrian Klaver <aklaver(at)comcast(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: Durumdara <durumdara(at)gmail(dot)com>
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-19 20:54:36
Message-ID: 200912191254.36902.aklaver@comcast.net

On Friday 18 December 2009 4:30:46 am Durumdara wrote:
> Hi!
>
> I have a software that uses Postgresql. This program (and website)
> developed and working on Window (XP/2003), with native charset (win1250).
>
> Prior week we got a special request to install this software to a Linux
> server.
>
> Yesterday I installed Ubu9.10 on VirtualBox, and tried to moving the
> database under Linux.
>
> First big problem is that when I tried to create a database with same
> parameters as in Windows, the PGAdmin show an error.
> The errormessage is:
> "Error: new encoding (Win1250) is incompatible with the encoding of the
> template database (UTF8)."
>
> Ok, I changed to "template0".
>
> Then I got error that Win1250 is not good for collation hu_HU.UTF8.
>
> When I tried to insert hungarian chars (to check sort order), the C and
> POSIX return wrong result - as I thought before.
>
> The Windows version of PG and Admin is not supports collation, so these two
> options are disable (collation, character type).

There is a Linux version of PGAdmin available for Ubuntu 9.10.

>
> But in Linux I have only UTF version that can sort rows in good order.
>
> The problem that the client program is win1250 based, and I must rewrite
> all things to make same results.
>
> Have anybody some way, some tricky solution for this problem?

Use psql and CREATE DATABASE:
http://www.postgresql.org/docs/8.4/interactive/sql-createdatabase.html

>
> Thanks for your help:
> dd

--
Adrian Klaver
aklaver(at)comcast(dot)net


From: Dave Page <dpage(at)pgadmin(dot)org>
To: Adrian Klaver <aklaver(at)comcast(dot)net>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>, Durumdara <durumdara(at)gmail(dot)com>
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-19 21:04:30
Message-ID: 937d27e10912191304q5aeee4eas4cc2bd8557c5973@mail.gmail.com

On Sat, Dec 19, 2009 at 8:54 PM, Adrian Klaver <aklaver(at)comcast(dot)net> wrote:
>> The Windows version of PG and Admin is not supports collation, so these two
>> options are disable (collation, character type).
>
> There is a Linux version of PGAdmin available for Ubuntu 9.10.

Doesn't matter - pgAdmin supports collation and ctype on all platforms
when creating databases. If the options are disabled, it's because the
OP is running a server older than 8.4.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Adrian Klaver <aklaver(at)comcast(dot)net>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: "pgsql-general" <pgsql-general(at)postgresql(dot)org>, Durumdara <durumdara(at)gmail(dot)com>
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-19 21:09:08
Message-ID: 200912191309.09133.aklaver@comcast.net

On Saturday 19 December 2009 1:04:30 pm Dave Page wrote:
> On Sat, Dec 19, 2009 at 8:54 PM, Adrian Klaver <aklaver(at)comcast(dot)net> wrote:
> >> The Windows version of PG and Admin is not supports collation, so these
> >> two options are disable (collation, character type).
> >
> > There is a Linux version of PGAdmin available for Ubuntu 9.10.
>
> Doesn't matter - pgAdmin supports collation and ctype on all platforms
> when creating databases. If the options are disabled, it's because the
> OP is running a server older than 8.4.

That is what I get for assuming. I figured since the OP was using Ubuntu 9.10
they were using the default version of Postgres, 8.4.

--
Adrian Klaver
aklaver(at)comcast(dot)net


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Durumdara *EXTERN*" <durumdara(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-19 22:32:04
Message-ID: D960CB61B694CF459DCFB4B0128514C203A8991A@exadv11.host.magwien.gv.at

Durumdara wrote:
> I have a software that uses Postgresql. This program (and website) developed and working on Window (XP/2003),
> with native charset (win1250).
>
> Prior week we got a special request to install this software to a Linux server.
>
> Yesterday I installed Ubu9.10 on VirtualBox, and tried to moving the database under Linux.
>
> First big problem is that when I tried to create a database with same parameters as in Windows, the PGAdmin
> show an error.
> The errormessage is:
> "Error: new encoding (Win1250) is incompatible with the encoding of the template database (UTF8)."
>
> Ok, I changed to "template0".
>
> Then I got error that Win1250 is not good for collation hu_HU.UTF8.
>
> When I tried to insert hungarian chars (to check sort order), the C and POSIX return wrong result - as I thought before.
>
> The Windows version of PG and Admin is not supports collation, so these two options are disable (collation,
> character type).
>
> But in Linux I have only UTF version that can sort rows in good order.
>
> The problem that the client program is win1250 based, and I must rewrite all things to make same results.
>
> Have anybody some way, some tricky solution for this problem?

If the collation hu_HU.UTF8 is what you want (it sorts rows in the correct
order), you should use UTF8 as the database encoding.
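
To illustrate why the C and POSIX collations give the wrong order for Hungarian text: they compare strings byte by byte, so accented letters sort after "z". The sketch below imitates that with plain Python; the word list and the helper name `c_collation_sort` are made up for the illustration, and this is not how the server itself is implemented.

```python
def c_collation_sort(items, encoding="utf-8"):
    """Sort strings by their encoded byte values, as a C/POSIX collation would."""
    return sorted(items, key=lambda s: s.encode(encoding))

# Hypothetical sample words; a Hungarian reader would expect:
# alma, ár, olaj, őz, zab
words = ["zab", "ár", "alma", "őz", "olaj"]

# Byte-wise order puts the accented words after "zab",
# both in UTF-8 and in WIN1250 (cp1250).
print(c_collation_sort(words, "utf-8"))
print(c_collation_sort(words, "cp1250"))
```

A language-aware collation such as hu_HU.UTF8 is what places "ár" next to "alma" and "őz" next to "olaj".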

If you need the data in WIN1250 on the client side, change the client encoding to WIN1250.

So:
- Create the database with UTF8.
- Change the client encoding to WIN1250 (e.g. by setting the environment variable PGCLIENTENCODING).
- Import the dump of the Windows database. It will be converted to UTF-8.
- Make sure that the client program has client encoding WIN1250.
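
The steps above could be scripted roughly as follows. This is only a sketch, assuming the PostgreSQL 8.4 client tools `createdb` and `psql` are on the PATH and that the locale is installed under the name used in this thread; the database name `mydb`, the file name `windows_dump.sql`, and the helper `build_import_plan` are hypothetical. The helper only builds the commands and environments, it does not run anything.

```python
import os

def build_import_plan(dbname, dump_file, locale="hu_HU.UTF8"):
    """Return (argv, env) pairs: create a UTF8 database, then load a WIN1250 dump."""
    # template0 is needed because the encoding/locale may differ from template1.
    create_db = [
        "createdb",
        "-T", "template0",
        "-E", "UTF8",
        "--lc-collate=" + locale,
        "--lc-ctype=" + locale,
        dbname,
    ]
    # PGCLIENTENCODING is set only in the environment passed to psql,
    # so other programs on the machine are not affected.
    import_env = dict(os.environ, PGCLIENTENCODING="WIN1250")
    import_dump = ["psql", "-d", dbname, "-f", dump_file]
    return [(create_db, dict(os.environ)), (import_dump, import_env)]

plan = build_import_plan("mydb", "windows_dump.sql")
for argv, env in plan:
    print(" ".join(argv))
```

Each pair could then be passed to `subprocess.check_call(argv, env=env)` once the names have been adapted to the real installation.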

Yours,
Laurenz Albe


From: Durumdara <durumdara(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-21 09:26:51
Message-ID: 9e384ef60912210126j345fa4f2r86aecac411a3c73c@mail.gmail.com

Hi!

2009/12/19 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>

> If you need the data in WIN1250 on the client side, change the client
> encoding to WIN1250.
>
> So:
> - Create the database with UTF8.
> - Change the client encoding to WIN1250 (e.g. by setting the environment
> variable PGCLIENTENCODING).
> - Import the dump of the Windows database. It will be converted to UTF-8.
> - Make sure that the client program has client encoding WIN1250.
>
> Yours,
> Laurenz Albe
>

So if I have Python and PyGreSQL, can I set this value in Python?
The main problem is that I don't want to set this value globally; other
applications may want to use another encoding...

Thanks for your help:
dd


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Durumdara <durumdara(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-21 09:41:11
Message-ID: 20091221094111.GA28076@svana.org

On Mon, Dec 21, 2009 at 10:26:51AM +0100, Durumdara wrote:
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally - possible
> another applications want to use another encoding...

Each connection can set the encoding to whatever they like. Something I
find useful is to setup the DB as UTF-8 but then do:

ALTER DATABASE foo SET client_encoding = latin9;

which sets the default for the DB, or

ALTER USER bar SET client_encoding = latin9;

Which lets you set the defaults for each user. This means that old
scripts can work unchanged but newer scripts can choose UTF-8 if they
want it.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Durumdara <durumdara(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-21 10:45:10
Message-ID: 968C401E-0008-48E1-A233-A391AFA01CE4@solfertje.student.utwente.nl

On 21 Dec 2009, at 10:26, Durumdara wrote:

> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally - possible another applications want to use another encoding....

Sure you can, just execute SET client_encoding TO 'WIN1250' once you've set up your connection. You can even do that between queries if your client encoding requirements change between queries.
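
A minimal sketch of that in Python, assuming a connection object with a `query()` method that sends SQL to the server (as connections in the classic PyGreSQL `pg` module provide). The helper `set_client_encoding` and the `FakeConnection` stand-in are hypothetical; the stand-in is there only so the example runs without a server.

```python
def set_client_encoding(conn, encoding="WIN1250"):
    """Set client_encoding for this one connection only."""
    # The name is interpolated into SQL, so accept only simple identifiers.
    if not encoding.replace("_", "").isalnum():
        raise ValueError("unexpected encoding name: %r" % encoding)
    conn.query("SET client_encoding TO '%s'" % encoding)

class FakeConnection:
    """Records statements instead of sending them to a server."""
    def __init__(self):
        self.statements = []

    def query(self, sql):
        self.statements.append(sql)

conn = FakeConnection()
set_client_encoding(conn)
print(conn.statements)
```

With a real connection the call would go right after connecting. The setting lasts for that session only, so other applications keep whatever encoding they ask for.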

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.



From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Durumdara *EXTERN*" <durumdara(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-21 12:52:22
Message-ID: D960CB61B694CF459DCFB4B0128514C2039380E4@exadv11.host.magwien.gv.at

Durumdara wrote:
>> - Change the client encoding to WIN1250 (e.g. by
>> setting the environment variable PGCLIENTENCODING).
>
> So if I have Python and pygresql, can I set this value in Python?
> The main problem that I don't want to set this value globally
> - possible another applications want to use another encoding...

There may be special Python functions, but you can use the following
SQL statement: SET client_encoding TO 'WIN1250'

Yours,
Laurenz Albe


From: Durumdara <durumdara(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-21 14:24:19
Message-ID: 9e384ef60912210624j25ae9948h355708ad3156152d@mail.gmail.com

Hi!

2009/12/21 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>

> Durumdara wrote:
> >> - Change the client encoding to WIN1250 (e.g. by
> >> setting the environment variable PGCLIENTENCODING).
> >
> > So if I have Python and pygresql, can I set this value in Python?
> > The main problem that I don't want to set this value globally
> > - possible another applications want to use another encoding...
>
> There may be special Python functions, but you can use the following
> SQL statement: SET client_encoding TO 'WIN1250'
>

And what happening what DB recognize not win1250 character in SQL?
Is it converted to "?" or an exception dropped?
And if the UTF db contains non win1250 character?
Is it replaced in result with "?" or some exception dropped?

Thanks:
dd


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Durumdara *EXTERN*" <durumdara(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Charset Win1250 on Windows and Ubuntu
Date: 2009-12-22 09:58:52
Message-ID: D960CB61B694CF459DCFB4B0128514C2039380EC@exadv11.host.magwien.gv.at

Durumdara wrote:
[client_encoding is switched to WIN1250]
> And what happening what DB recognize not win1250 character in SQL?
> Is it converted to "?" or an exception dropped?
> And if the UTF db contains non win1250 character?
> Is it replaced in result with "?" or some exception dropped?

What you wrote is very confusing/confused; this is probably a
language problem.

I'll try to reformulate your questions and answer them; if I
got something wrong, please tell me.

Q: What happens if your SQL statement contains a character that is not WIN1250 encoded?
Is it converted to "?" or do you get an error?

A: You get an error (this is not Oracle). Here is an example for hex 88:
ERROR: character 0x88 of encoding "WIN1250" has no equivalent in "UTF8"
Since every known character is representable in UTF-8, that means
that this is an invalid byte.

Q: What happens if you select a character in the UTF8 database that cannot be
converted to WIN1250?

A: You will also get an error. Here is what you get for selecting a "G clef":
ERROR: character 0xf09d849e of encoding "UTF8" has no equivalent in "WIN1250"
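
Both cases can be reproduced on the client side with Python's own codecs. This is only an analogy, assuming Python's `cp1250` codec as a stand-in for WIN1250; it does not call the server's conversion routines, and the helper names are hypothetical.

```python
def to_server_utf8(client_bytes):
    """Convert WIN1250 bytes (client side) to UTF-8 bytes (database side)."""
    return client_bytes.decode("cp1250").encode("utf-8")

def to_client_win1250(server_bytes):
    """Convert UTF-8 bytes (database side) to WIN1250 bytes (client side)."""
    return server_bytes.decode("utf-8").encode("cp1250")

# Ordinary Hungarian text converts in both directions without loss.
text = "árvíztűrő"
assert to_client_win1250(to_server_utf8(text.encode("cp1250"))) == text.encode("cp1250")

# Byte 0x88 is not assigned in WIN1250, so the input conversion fails.
try:
    to_server_utf8(b"\x88")
except UnicodeDecodeError as exc:
    print("input error:", exc)

# The G clef (UTF-8 bytes f0 9d 84 9e) has no WIN1250 equivalent.
try:
    to_client_win1250(b"\xf0\x9d\x84\x9e")
except UnicodeEncodeError as exc:
    print("output error:", exc)
```

In both directions the conversion raises an error rather than substituting "?".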

Yours,
Laurenz Albe