From: | Dave Page <dpage(at)postgresql(dot)org> |
---|---|
To: | Trevor Talbot <quension(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Locale + encoding combinations |
Date: | 2007-10-12 14:26:00 |
Message-ID: | 470F83F8.5020503@postgresql.org |
Lists: | pgsql-hackers |
Trevor Talbot wrote:
> The encoding output is the one you specified.
OK.
> Keep in mind,
> underneath Windows is mostly working with Unicode, so all characters
> exist and the locale rules specify their behavior there. The encoding
> is just the byte stream it needs to force them all into after doing
> whatever it does to them. As you've seen, it uses some sort of
> best-fit mapping I don't know the details of. (It will drop accent
> marks and choose characters with similar shape where possible, by
> default.)
Right, that makes sense. The codepages used by setlocale() etc. are just
translation tables to/from the internal Unicode representation.
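As a rough illustration of the translation-table view, here is a minimal Python sketch. It is not the Windows implementation: the `best_fit` helper is hypothetical, and it only approximates the accent-dropping behaviour described above by decomposing a character and discarding its combining marks. The real best-fit tables are not documented in this thread.

```python
import unicodedata


def best_fit(text: str, codepage: str) -> bytes:
    """Encode text into codepage, approximating characters it cannot hold.

    Hypothetical helper: imitates "drop the accent" by Unicode
    decomposition, which is an assumption, not the Windows algorithm.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codepage)
        except UnicodeEncodeError:
            # Split into base character + combining marks, keep the base.
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            # Anything still unmappable becomes '?'.
            out += base.encode(codepage, errors="replace")
    return bytes(out)


# The same character maps to different bytes in different codepages.
as_1252 = "é".encode("cp1252")
as_437 = as_1252.decode("cp1252").encode("cp437")

# U+0101 (a with macron) is not in cp1252, so the sketch falls back to 'a'.
approximated = best_fit("\u0101", "cp1252")
```

The round trip through `str` plays the role of the internal Unicode representation: each codepage is only consulted when bytes enter or leave.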
> I think it's a bit more complex for input/transform cases where you
> operate on the byte stream directly without intermediate conversion to
> Unicode, which is why UTF-8 doesn't work as a codepage, but again I
> don't have the details nearby. I can try to do more digging if
> needed.
It does (sort of) work as a codepage; it just doesn't have the NLS file
that defines how things like UPPER() and LOWER() should work.
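By way of analogy only, the following Python sketch shows why case mapping needs character-level rules in addition to an encoding. It uses Python's own `bytes.upper()` and `str.upper()`, and says nothing about how Windows or PostgreSQL implement UPPER() and LOWER().

```python
raw = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

# Working on the byte stream directly: only ASCII letters are mapped,
# so the multi-byte sequence passes through unchanged.
bytes_upper = raw.upper()

# Decoding to Unicode first lets the character-level case rules apply.
text_upper = raw.decode("utf-8").upper().encode("utf-8")
```

The encoding step works in both cases; what differs is whether any case-mapping data is applied to the non-ASCII character.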
Regards, Dave