Re: getting 'order by' working with unicode locale? ICU?

Lists: pgsql-hackers
From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: getting 'order by' working with unicode locale? ICU?
Date: 2004-12-16 02:21:05
Message-ID: 532643C89813E7DEFCA754D5@palle.girgensohn.se

Hi!

I'm using Postgresql on FreeBSD, and would like to get "order by" to work
with unicode. The OS does not have collation implemented for unicode (UTF-8)
locales. Some FreeBSD people point me towards IBM's ICU kit.

How much effort would be required to get PostgreSQL to sort properly,
mainly using the sv_SE.UTF-8 locale? (So the problem is not *that* hard; I
don't need to sort Chinese [yet] :) What needs to be done to get
PostgreSQL to use ICU (or some other working mechanism)?

Thanks,
Palle


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Palle Girgensohn <girgen(at)pingpong(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting 'order by' working with unicode locale? ICU?
Date: 2004-12-16 04:21:13
Message-ID: 19511.1103170873@sss.pgh.pa.us

Palle Girgensohn <girgen(at)pingpong(dot)net> writes:
> I'm using Postgresql on FreeBSD, and would like to get "order by" to work
> with unicode.

What makes you think it doesn't? Use the right locale and you're set.

regards, tom lane


From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting 'order by' working with unicode locale? ICU?
Date: 2004-12-16 07:57:14
Message-ID: F9790B74A5A24DE3CA276F5F@palle.girgensohn.se

--On Wednesday, December 15, 2004 23.21.13 -0500 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
wrote:

> Palle Girgensohn <girgen(at)pingpong(dot)net> writes:
>> I'm using Postgresql on FreeBSD, and would like to get "order by" to
>> work with unicode.
>
> What makes you think it doesn't? Use the right locale and you're set.

Not on FreeBSD, since collation is not implemented in unicode locales. One
way would be to implement it in the OS, of course...

/Palle


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Palle Girgensohn <girgen(at)pingpong(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting 'order by' working with unicode locale? ICU?
Date: 2004-12-16 08:20:50
Message-ID: 200412160920.50474.peter_e@gmx.net

Palle Girgensohn wrote:
> Not on FreeBSD, since collation is not implemented in unicode
> locales. One way would be to implement it in the OS, of course...

Try taking the locale definition files from another system and use
localedef to build locale files for your local system. The localedef
source files are supposed to be portable.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting 'order by' working with unicode locale? ICU?
Date: 2004-12-18 01:41:35
Message-ID: 93E9A08CD90D8DFF6E9E2FB8@palle.girgensohn.se

--On Thursday, December 16, 2004 09.20.50 +0100 Peter Eisentraut
<peter_e(at)gmx(dot)net> wrote:

> Palle Girgensohn wrote:
>> Not on FreeBSD, since collation is not implemented in unicode
>> locales. One way would be to implement it in the OS, of course...
>
> Try taking the locale definition files from another system and use
> localedef to build locale files for your local system. The localedef
> source files are supposed to be portable.

As far as I understand, there is no code in FreeBSD to specify the
collating order for multibyte locales. Would it be easier to fix the OS or
hack ICU into PostgreSQL?

A bit off topic: I'm still dreaming of a way to get "order by" working with
different locales for the same database (different clients getting
different collation depending on their locale choice). Now this is
hardcoded at initdb time. Is there any way this could work, ever, in
PostgreSQL, or will I have to sort client side?

Regards,
Palle


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Palle Girgensohn <girgen(at)pingpong(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: getting 'order by' working with unicode locale? ICU?
Date: 2004-12-22 09:03:11
Message-ID: 1103706191.12271.3.camel@fuji.krosing.net

On a fine day (Saturday, 18 December 2004, 02:41+0100),
Palle Girgensohn wrote:
>
> --On Thursday, December 16, 2004 09.20.50 +0100 Peter Eisentraut
> <peter_e(at)gmx(dot)net> wrote:
>
> > Palle Girgensohn wrote:
> >> Not on FreeBSD, since collation is not implemented in unicode
> >> locales. One way would be to implement it in the OS, of course...
> >
> > Try taking the locale definition files from another system and use
> > localedef to build locale files for your local system. The localedef
> > source files are supposed to be portable.
>
> As far as I understand, there is no code in FreeBSD to specify the
> collating order for multibyte locales. Would it be easier to fix the OS or
> hack ICU into PostgreSQL?
>
> A bit off topic: I'm still dreaming of a way to get "order by" working with
> different locales for the same database (different clients getting
> different collation depending on their locale choice). Now this is
> hardcoded at initdb time. Is there any way this could work, ever, in
> PostgreSQL, or will I have to sort client side?

I guess you can write a function that returns something client-specific
and sort on that.

select weirdnames
from namelist
order by localesort(weirdnames, 'SE');

You can even build an index on localesort(weirdnames, 'SE') to speed
things up for some queries.
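
To illustrate what such a sort-key function might compute, here is a
minimal sketch in Python rather than PostgreSQL code. The name localesort
comes from the query above, but everything inside it is an assumption made
for illustration: the hand-written Swedish alphabet table, the simplified
German folding rules, and the fallback for unknown characters. A real
implementation would delegate to a proper collation library such as ICU.

```python
import unicodedata

# Assumed, hand-written collation data for illustration only.
# Swedish places å, ä, ö after z, in that order.
_SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
_SWEDISH_RANK = {ch: i for i, ch in enumerate(_SWEDISH_ALPHABET)}

# Simplified German dictionary-style folding: umlauts sort with their base letter.
_GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def _rank(ch, table):
    """Known letters get their alphabet position; others sort after them by code point."""
    return table.get(ch, len(table) + ord(ch))


def localesort(text, locale_code):
    """Return a sort key for text under the (hypothetical) locale code given."""
    # Normalize so that precomposed and decomposed forms of å/ä/ö compare equal.
    lowered = unicodedata.normalize("NFC", text).lower()
    if locale_code == "SE":
        primary = tuple(_rank(ch, _SWEDISH_RANK) for ch in lowered)
    elif locale_code == "DE":
        primary = tuple(ord(ch) for ch in lowered.translate(_GERMAN_FOLD))
    else:
        raise ValueError("no collation table for locale code: " + locale_code)
    # The original text breaks ties between strings that differ only in case.
    return (primary, text)


names = ["Östen", "Anna", "Åke", "Zlatan", "Ärlig"]
swedish_order = sorted(names, key=lambda n: localesort(n, "SE"))
german_order = sorted(names, key=lambda n: localesort(n, "DE"))
codepoint_order = sorted(names)  # what a plain code point comparison gives
```

The same list sorts differently for each locale code, which is the
per-client behaviour asked about above. Note that for the index idea to
work in PostgreSQL, the function would have to be declared IMMUTABLE.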

And yes, I think using ICU is the right way to do it ;)

------------------
Hannu