Re: Encoding and i18n

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Gregory Stark <stark(at)enterprisedb(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Encoding and i18n
Date: 2007-10-06 18:24:28
Message-ID: 27167.1191695068@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I tried on both a UTF8 and Latin1 terminal and it works OK in all cases.

The cases that would be interesting involve to_char's locale-specific
format codes (e.g. Dy) along with LC_TIME settings that are deliberately
incompatible with the database encoding. client_encoding is not relevant.

It's not real clear to me whether, on a Unix machine, there is even
supposed to be any difference between setting LC_TIME=es_ES.iso88591 and
setting it to es_ES.utf8. Since nl_langinfo(CODESET) is supposedly
determined only by LC_CTYPE, you could argue that strftime's results
should be in that encoding regardless, and that the codeset component of
other LC_ variables should be ignored. Some experimentation suggests
that at least in glibc it doesn't work that way, and that there is in
fact no principled way for you to find out what encoding strftime is
giving you :-(.

$ LANG=es_ES.utf8 date
sb oct 6 14:11:30 EDT 2007
$ LANG=es_ES.iso88591 date
sb oct 6 14:11:42 EDT 2007
$ LANG=en_US.iso88591 LC_TIME=es_ES.utf8 date
sb oct 6 14:12:10 EDT 2007
$ LC_CTYPE=en_US.iso88591 LC_TIME=es_ES.utf8 date
sb oct 6 14:12:34 EDT 2007
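
The nl_langinfo(CODESET) point above can be checked with a small probe. This is only a sketch in Python rather than C: codeset_for is a hypothetical helper name, and the locale names in the usage note are assumed to be installed on the machine (the helper returns None when one is not).

```python
import locale


def codeset_for(lc_ctype, lc_time):
    """Return nl_langinfo(CODESET) with the two categories set as given.

    Hypothetical helper for illustration. Returns None if either locale
    name is rejected by setlocale(). The caller's locale is restored.
    """
    saved_ctype = locale.setlocale(locale.LC_CTYPE)  # query current values
    saved_time = locale.setlocale(locale.LC_TIME)
    try:
        locale.setlocale(locale.LC_CTYPE, lc_ctype)
        locale.setlocale(locale.LC_TIME, lc_time)
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved_ctype)
        locale.setlocale(locale.LC_TIME, saved_time)
```

If nl_langinfo(CODESET) really depends only on LC_CTYPE, then codeset_for("en_US.iso88591", "es_ES.utf8") and codeset_for("en_US.iso88591", "es_ES.iso88591") should report the same codeset. Note that this says nothing about which encoding strftime actually uses for day and month names, which is the part that seems to have no reliable answer.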

Perhaps a workable fix for this would be to try to mangle the LC_ settings
we pass to setlocale() so that they all have the same codeset component
(if any). It looks like the convention of ".foo" being a codeset name
is fairly well standardized, even if the spelling of the codeset name is
not ...
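
A rough sketch of that mangling, in Python for brevity, might look like the following. The function names are hypothetical, the choice of LC_CTYPE as the reference category is an assumption, and names with no ".foo" part (such as "C" or "es_ES") are left alone, which is one possible reading of "(if any)".

```python
def split_locale(name):
    """Split 'lang_TERR.codeset@modifier' into (base, codeset, modifier).

    Missing parts come back as None.
    """
    rest, at, modifier = name.partition("@")
    base, dot, codeset = rest.partition(".")
    return base, (codeset if dot else None), (modifier if at else None)


def with_codeset(name, codeset):
    """Return name with its codeset component replaced by codeset.

    Names that carry no codeset component are returned unchanged.
    """
    base, old, modifier = split_locale(name)
    if old is None:
        return name
    result = base + "." + codeset
    if modifier is not None:
        result += "@" + modifier
    return result


def unify_codesets(settings, reference="LC_CTYPE"):
    """Give every category the codeset spelled in settings[reference].

    settings maps category names to locale names. If the reference
    setting has no codeset component, nothing is changed.
    """
    _, codeset, _ = split_locale(settings[reference])
    if codeset is None:
        return dict(settings)
    return {cat: with_codeset(name, codeset) for cat, name in settings.items()}


if __name__ == "__main__":
    print(unify_codesets({"LC_CTYPE": "en_US.iso88591", "LC_TIME": "es_ES.utf8"}))
    # prints {'LC_CTYPE': 'en_US.iso88591', 'LC_TIME': 'es_ES.iso88591'}
```

Copying the reference spelling verbatim sidesteps the problem that codeset names are not spelled consistently, but it does not check that the rewritten locale name actually exists on the system.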

regards, tom lane
