locales and encodings on Windows

From: Aleksander Kmetec <aleksander(dot)kmetec(at)intera(dot)si>
To: pgsql-hackers-win32(at)postgresql(dot)org
Subject: locales and encodings on Windows
Date: 2004-11-06 01:26:19
Message-ID: 418C283B.1060905@intera.si
Lists: pgsql-hackers-win32

I would like to bring to your attention a problem regarding locale
support on Windows. The description below uses UNICODE/UTF8, but the
issue isn't limited to just this encoding.

Because Postgres relies on the operating system for some string-related
functions, the OS needs to support the same encoding as the one that is
used as the database encoding. Unfortunately, Windows does not support
some encodings that are available as server-side encodings for PG.

Here is a short example in case the previous paragraph doesn't make much
sense: with a UNICODE database (actually UTF8) you need to use a
compatible locale when running initdb; in my case that's "sl_SI.utf8"
(on Linux) or "Slovenian_Slovenia.65001" (on Windows).

65001 is the Windows codepage number for UTF-8, except that it's not
really a valid codepage. The document at
http://www.sharmahd.com/tm/codepages.html states that: "65000 (UTF-7)
and 65001 (UTF-8) are pseudo codepages. There are no corresponding NLS
files. The code page IDs can only be used with WideCharToMultiByte( )
and MultiByteToWideChar( ) API calls."
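
To make the point above concrete, here is a minimal sketch of a check
that flags a Windows-style locale name whose codepage suffix is one of
the two pseudo codepages named in the quote. The function names are
hypothetical and this is not code from initdb; the only facts it relies
on are the codepage numbers 65000 and 65001 from the quoted document.

```python
# Pseudo codepages named in the quoted document: usable for conversion
# calls only, with no NLS data behind them.
PSEUDO_CODEPAGES = {65000: "UTF-7", 65001: "UTF-8"}


def codepage_of(locale_name):
    """Return the numeric codepage suffix of a locale name, or None.

    Hypothetical helper: "Slovenian_Slovenia.65001" yields 65001, while
    a POSIX-style name such as "sl_SI.utf8" has no numeric suffix.
    """
    _, sep, suffix = locale_name.rpartition(".")
    if not sep or not suffix.isdecimal():
        return None
    return int(suffix)


def is_pseudo_codepage(locale_name):
    """True if the locale name ends in a conversion-only codepage."""
    return codepage_of(locale_name) in PSEUDO_CODEPAGES
```

A locale such as "Slovenian_Slovenia.1250" passes this check, while
"Slovenian_Slovenia.65001" is flagged.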

This means that UPPER(), LOWER() and ORDER BY do not work correctly for
Unicode databases. Currently it's not even possible to run initdb with
a locale that uses codepage 65001. A small change to initdb enabled me
to set LC_COLLATE to Slovenian_Slovenia.65001, but the sort order was
still badly messed up, which makes sense considering the above quote.
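
As an illustration of what a wrong sort order can look like, the sketch
below compares a plain byte-wise sort of UTF-8 strings with a sort that
follows the Slovenian alphabet. It does not reproduce what Windows
actually does with codepage 65001; the byte-wise fallback is an
assumption made for the example, and the word list and key functions
are hypothetical.

```python
# Slovenian alphabet (25 letters): č, š and ž follow c, s and z.
SL_ALPHABET = "abcčdefghijklmnoprsštuvzž"
RANK = {ch: i for i, ch in enumerate(SL_ALPHABET)}


def bytewise_key(word):
    """Sort key that compares the raw UTF-8 bytes (no locale data)."""
    return word.encode("utf-8")


def slovenian_key(word):
    """Simplified language-aware key: rank each letter by alphabet position.

    Letters outside the alphabet sort after it, by code point.
    """
    return [(RANK.get(ch, len(SL_ALPHABET)), ch) for ch in word.lower()]


words = ["zima", "čaj", "cesta", "dan", "šola", "sonce"]

by_bytes = sorted(words, key=bytewise_key)
by_alphabet = sorted(words, key=slovenian_key)

print(by_bytes)     # č and š land after z: multi-byte UTF-8 sequences
print(by_alphabet)  # č follows c, š follows s
```

With byte-wise comparison, every word starting with č or š ends up
after "zima", because those letters encode to bytes above the ASCII
range.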

After some checking I came up with this list of encodings which are
supported by PG, but not mentioned anywhere as supported by Windows:
UTF8
EUC_CN
EUC_TW
LATIN6 (ISO 8859-10/ECMA 144)
LATIN7 (ISO 8859-13)
LATIN8 (ISO 8859-14)
LATIN10 (ISO 8859-16/ASRO SR 14111)

Is there a solution for this, other than marking these encodings as not
available on Windows?

Regards,
Aleksander
