Lists: | pgsql-general |
---|
From: | Paul Gaspar <devlist(at)revolversoft(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Collation in ORDER BY not lexicographical |
Date: | 2009-09-29 08:52:02 |
Message-ID: | 20BC5D30-5677-4E0F-A9EB-A05DBC2CE866@revolversoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
Hi!
We have big problems with collation in ORDER BY, which happens in
binary order, not alphabetic (lexicographical), like:.
A
B
Z
a
z
Ä
Ö
ä
ö
This is what we have done:
1. initdb -U myuser -E UTF-8 --locale=de_DE.UTF-8 -D /mydata
2. pg_ctl -U myuser -D /mydata -l logfile start
3. createdb -U myuser mydb
4. create table mytable(name text)
5. insert into mytable values('Adam'); ....
6. select * from mytable order by name
The result is:
"Adam"
"Berta"
"Mann"
"Zoo"
"Zoodirektor"
"adam"
"Äpfel"
"Öl"
"äpfel"
"locale -a" shows:
de_DE
de_DE.ISO8859-1
de_DE.ISO8859-15
de_DE.UTF-8
"psql -l" shows:
List of databases
Name | Owner | Encoding | Collation | Ctype
| Access privileges
----------+-----------+----------+-------------+-------------
+-----------------------------------
postgres | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
template0 | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | =c/myuser
: myuser =CTc/myuser
template1 | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | =c/myuser
:
myuser =CTc/myuser
mydb | myuser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
PG is running on Mac OS X 10.5 and 10.6 Intel.
Any help is appreciated. Thanks very much in advance.
Paul
From: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | Paul Gaspar <devlist(at)revolversoft(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Collation in ORDER BY not lexicographical |
Date: | 2009-09-29 09:21:24 |
Message-ID: | dcc563d10909290221p72dc2b08m6398c2cd5c20dcc9@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist(at)revolversoft(dot)com> wrote:
> Hi!
>
> We have big problems with collation in ORDER BY, which happens in binary
> order, not alphabetic (lexicographical), like:.
>
> A
> B
> Z
> a
> z
> Ä
> Ö
> ä
> ö
>
> PG is running on Mac OS X 10.5 and 10.6 Intel.
I seem to recall there were some problem with Mac locales at some
point being broken. Could be you're running into that issue.
From: | Maximilian Tyrtania <maximilian(dot)tyrtania(at)onlinehome(dot)de> |
---|---|
To: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Paul Gaspar <devlist(at)revolversoft(dot)com> |
Cc: | 'PostgreSQL pg-general List' <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Collation in ORDER BY not lexicographical |
Date: | 2009-09-29 12:36:57 |
Message-ID: | C6E7CC09.3B8A7%maximilian.tyrtania@onlinehome.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
am 29.09.2009 11:21 Uhr schrieb Scott Marlowe unter scott(dot)marlowe(at)gmail(dot)com:
> On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist(at)revolversoft(dot)com> wrote:
>> Hi!
>>
>> We have big problems with collation in ORDER BY, which happens in binary
>> order, not alphabetic (lexicographical), like:.
>>
>> A
>> B
>> Z
>> a
>> z
>> Ä
>> Ö
>> ä
>> ö
>>
>
>> PG is running on Mac OS X 10.5 and 10.6 Intel.
>
> I seem to recall there were some problem with Mac locales at some
> point being broken. Could be you're running into that issue.
Yep, i ran into this as well. Here is my workaround: Create a function like
this:
CREATE OR REPLACE FUNCTION f_getorderbyfriendlyversion(texttoconvert text)
RETURNS text AS
$BODY$
select
replace(replace(replace(replace(replace(replace($1,'Ä','A'),'Ö','O'),'Ü','U'
),'ä','a'),'ö','o'),'ü','u');
$BODY$
LANGUAGE 'sql' IMMUTABLE STRICT
COST 100;
ALTER FUNCTION f_getorderbyfriendlyversion(text) OWNER TO postgres;
Then create an index like this:
create index idx_personen_nachname_orderByFriendly on personen
(f_getorderbyfriendlyversion(nachname))
Now you can do:
select * from personen order by f_getorderbyfriendlyversion(p.nachname)
Seems pretty fast.
Best,
Maximilian Tyrtania
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | Paul Gaspar <devlist(at)revolversoft(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Collation in ORDER BY not lexicographical |
Date: | 2009-09-30 09:24:16 |
Message-ID: | 1254302656.10072.3.camel@fsopti579.F-Secure.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
On Tue, 2009-09-29 at 03:21 -0600, Scott Marlowe wrote:
> On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist(at)revolversoft(dot)com> wrote:
> > Hi!
> >
> > We have big problems with collation in ORDER BY, which happens in binary
> > order, not alphabetic (lexicographical), like:.
> >
> > A
> > B
> > Z
> > a
> > z
> > Ä
> > Ö
> > ä
> > ö
> >
>
> > PG is running on Mac OS X 10.5 and 10.6 Intel.
>
> I seem to recall there were some problem with Mac locales at some
> point being broken. Could be you're running into that issue.
Yes, the UTF8 locales on BSD systems (Mac OS X, FreeBSD, etc.) are
dysfunctional. Either switch to a non-UTF8 locale or a different
operating system.
From: | Paul Gaspar <devlist(at)revolversoft(dot)com> |
---|---|
To: | PostgreSQL pg-general List <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Collation in ORDER BY not lexicographical |
Date: | 2009-10-01 15:46:59 |
Message-ID: | EB5DD4FC-76D0-48CC-BB83-B09F1F252AAD@revolversoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-general |
Thank you all very much for your help.
Maximilian, we simplified your replacing code:
> replace(replace(replace(replace(replace(replace
> ($1,'Ä','A'),'Ö','O'),'Ü','U'
> ),'ä','a'),'ö','o'),'ü','u');
to this:
translate(upper($1),'ÄÖÜ','AOU')
Paul
Am 29.09.2009 um 14:36 schrieb Maximilian Tyrtania:
> am 29.09.2009 11:21 Uhr schrieb Scott Marlowe unter scott(dot)marlowe(at)gmail(dot)com
> :
>
>> On Tue, Sep 29, 2009 at 2:52 AM, Paul Gaspar <devlist(at)revolversoft(dot)com
>> > wrote:
>>> Hi!
>>>
>>> We have big problems with collation in ORDER BY, which happens in
>>> binary
>>> order, not alphabetic (lexicographical), like:.
>>
>>> PG is running on Mac OS X 10.5 and 10.6 Intel.
>>
>> I seem to recall there were some problem with Mac locales at some
>> point being broken. Could be you're running into that issue.
>
> Yep, i ran into this as well. Here is my workaround: Create a
> function like
> this:
>
> CREATE OR REPLACE FUNCTION f_getorderbyfriendlyversion(texttoconvert
> text)
>
> RETURNS text AS
> $BODY$
> select
> replace(replace(replace(replace(replace(replace
> ($1,'Ä','A'),'Ö','O'),'Ü','U'
> ),'ä','a'),'ö','o'),'ü','u');
>
> $BODY$
>
> LANGUAGE 'sql' IMMUTABLE STRICT
> COST 100;
>
> ALTER FUNCTION f_getorderbyfriendlyversion(text) OWNER TO postgres;
>
> Then create an index like this:
>
> create index idx_personen_nachname_orderByFriendly on personen
> (f_getorderbyfriendlyversion(nachname))
>
>
> Now you can do:
>
> select * from personen order by f_getorderbyfriendlyversion
> (p.nachname)
>
> Seems pretty fast.
>
> Best,
>
> Maximilian Tyrtania