Re: Set LC_COLLATE to de_DE_phoneb

Lists: pgsql-hackers
From: Frank Jagusch <frank(at)jagusch-online(dot)de>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Set LC_COLLATE to de_DE_phoneb
Date: 2010-04-06 20:12:49
Message-ID: 1270584769.6242.31.camel@AMD3800.zuhause
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

How to set the collation of a database to the german phone book sort
order?
I did ask this on several places. Finally the moderator of pg-forum.de
recommended to ask here. See the discussion there:
http://www.pg-forum.de/konfiguration/4308-sortierfolge-de_de_phoneb.html

Environment: PostgreSQL 8.4.3 build 1400, 32 Bit, Windows XP

The usual german collation is "German_Germany.1252". This corresponds to
the windows language setting de_DE an in the registry (HKEY_CURRENT_USER
\control\Panel\International\Locale) to the value 00000407.

The german phone book order has the windows language setting
de_DE_phoneb an the value 00010407 in the registry. Unfortunately I was
not able to find a corresponding string for the LC_COLLATE setting.

I searched the sources of postgresql an found the function
IsoLocaleName(...) in src/backend/utils/adt/pg_locale.c. I guess this
should be the place for further investigations. Or am I wrong? I'm not
quite well in C and without some knowledge of the libraries behind I
make no progress. Can anyone help me out? Is there anywhere a
documentation or a "translation table" for the different representations
of the language settings between the postgresql- and the
windows-"world"?

Background: I moved an old application from a borland paradox database
to postgesql. The speed gain is great but the sorting order isn't the
usual to the user. I can't change the order by clauses of the select
statements because they are generated by the borland database engine.

Thanks in advance
Frank Jagusch
--
http://www.jagusch-online.de/cdlfj


From: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Frank Jagusch <frank(at)jagusch-online(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Set LC_COLLATE to de_DE_phoneb
Date: 2010-04-07 07:56:12
Message-ID: 20100407165611.FCBF.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Frank Jagusch <frank(at)jagusch-online(dot)de> wrote:

> The german phone book order has the windows language setting
> de_DE_phoneb an the value 00010407 in the registry. Unfortunately I was
> not able to find a corresponding string for the LC_COLLATE setting.

I cannot find any resources for "de_DE_phoneb" in Web. What is the true
name for it? Locale names should be in <Country>_<Language>.<CodePage>
format on Windows. If you can find the counterpart name for it, you can
initialize PostgreSQL DB with the locale, and <CodePage> or UTF-8 encoding.

> Background: I moved an old application from a borland paradox database
> to postgesql. The speed gain is great but the sorting order isn't the
> usual to the user. I can't change the order by clauses of the select
> statements because they are generated by the borland database engine.

I'm afraid of "de_DE_phoneb" is an original locale implementation in your
old database. If so, PostgreSQL cannot support it because postgres depends
on locale libraries in each platform. (i.e., msvcrt on Windows)

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


From: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Frank Jagusch <frank(at)jagusch-online(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Set LC_COLLATE to de_DE_phoneb
Date: 2010-04-08 01:27:21
Message-ID: 20100408102721.95EF.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Frank Jagusch <frank(at)jagusch-online(dot)de> wrote:

> "de_DE_phoneb" is the name of an alternative sorting in german (only a
> few languages have alternate sorting). You may find some information
> when you search the MSDN for "de_DE_phoneb", i.e.
> http://msdn.microsoft.com/en-en/library/ms404373.aspx
> These alternate sorting is supported by the OS, but I don't know how it
> is supported in the msvcrt.

Hmmm, I found "de-DE_phoneb" in MSDN:
http://msdn.microsoft.com/en-us/library/dd374060
but setlocale("de-DE_phoneb") always fails at least on my machine.

The doc says "de-DE_phoneb" is a locale name for
MAKELCID(MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN), SORT_GERMAN_PHONE_BOOK).
Some of native Win32 APIs could accept the locale and sort-order
combination, but setlocale() in CRT seems to reject it.

So, you could use the locale if you find a setlocale-compatible name of
"de-DE_phoneb". Or, you cannot use it, unless we modify PostgreSQL to
use Win32 locale functions instead of standard libc ones -- but it is
hardly acceptable.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


From: Frank Jagusch <frank(at)jagusch-online(dot)de>
To: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Set LC_COLLATE to de_DE_phoneb
Date: 2010-04-08 16:59:44
Message-ID: 1270745984.6423.18.camel@AMD3800.zuhause
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Donnerstag, den 08.04.2010, 10:27 +0900 schrieb Takahiro Itagaki:
> Frank Jagusch <frank(at)jagusch-online(dot)de> wrote:
>
> > "de_DE_phoneb" is the name of an alternative sorting in german (only a
> > few languages have alternate sorting). You may find some information
> > when you search the MSDN for "de_DE_phoneb", i.e.
> > http://msdn.microsoft.com/en-en/library/ms404373.aspx
> > These alternate sorting is supported by the OS, but I don't know how it
> > is supported in the msvcrt.
>
> Hmmm, I found "de-DE_phoneb" in MSDN:
> http://msdn.microsoft.com/en-us/library/dd374060
> but setlocale("de-DE_phoneb") always fails at least on my machine.

Is it a windows box? May be you need to install some german language
support?

> The doc says "de-DE_phoneb" is a locale name for
> MAKELCID(MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN), SORT_GERMAN_PHONE_BOOK).
> Some of native Win32 APIs could accept the locale and sort-order
> combination, but setlocale() in CRT seems to reject it.
>
> So, you could use the locale if you find a setlocale-compatible name of
> "de-DE_phoneb".

As far as I investigated I didn't find one. I hoped to find such a
setlocale-compatible name or a hint where to search here...

> Or, you cannot use it, unless we modify PostgreSQL to
> use Win32 locale functions instead of standard libc ones -- but it is
> hardly acceptable.

I thought PostgreSQL is using OS specific functions for sorting (means
Win32 functions?). This sounds not good for my request. So I ask the
whole audience: Are others out there asking for a support for the
alternate sort orders? Is it worth to discuss further in this direction?

Here an other Idea: Is there a way to define a custom collation for a
database?

Thanks so far,
Frank Jagusch
--
http://www.jagusch-online.de/frank


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Frank Jagusch <frank(at)jagusch-online(dot)de>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Set LC_COLLATE to de_DE_phoneb
Date: 2010-04-10 07:20:16
Message-ID: 20100410072015.GA6944@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 08, 2010 at 06:59:44PM +0200, Frank Jagusch wrote:
> I thought PostgreSQL is using OS specific functions for sorting (means
> Win32 functions?). This sounds not good for my request. So I ask the
> whole audience: Are others out there asking for a support for the
> alternate sort orders? Is it worth to discuss further in this direction?
>
> Here an other Idea: Is there a way to define a custom collation for a
> database?

Over the years there have been various options suggested and various
patches posted but they never make it, for various reasons.

The only concerted effort I know of is the PostgreSQL ICU patch which
is in FreeBSD ports, to work around the fact that it's C library
doesn't handle UTF-8 collation at all.

http://people.freebsd.org/~girgen/postgresql-icu/README.html

I doubt it's been tested for Windows, but if it works it will give you
alternate sort orders and even custom sort orders.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle