Re: Locale + encoding combinations

From:	Dave Page <dpage(at)postgresql(dot)org>
To:	Trevor Talbot <quension(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Locale + encoding combinations
Date:	2007-10-12 14:26:00
Message-ID:	470F83F8.5020503@postgresql.org
Lists:	pgsql-hackers

Trevor Talbot wrote:
> The encoding output is the one you specified.

OK.

> Keep in mind,
> underneath Windows is mostly working with Unicode, so all characters
> exist and the locale rules specify their behavior there. The encoding
> is just the byte stream it needs to force them all into after doing
> whatever it does to them. As you've seen, it uses some sort of
> best-fit mapping I don't know the details of. (It will drop accent
> marks and choose characters with similar shape where possible, by
> default.)

Right, that makes sense. The codepages used by setlocale() etc. are just
translation tables to/from the internal Unicode representation.
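
To make the "translation table" idea concrete, here is a minimal Python sketch. It is not Windows code: transcode() and approximate_best_fit() are hypothetical helper names, and the accent-stripping fallback only imitates the behaviour Trevor describes. The actual best-fit tables Windows uses are not reproduced here.

```python
import unicodedata


def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Go from one code page to another via the Unicode representation."""
    text = data.decode(src)   # code page bytes -> Unicode characters
    return text.encode(dst)   # Unicode characters -> target code page bytes


def approximate_best_fit(text: str, codec: str) -> bytes:
    """Encode text, falling back to an unaccented base character.

    ASSUMPTION: this only approximates a best-fit mapping by stripping
    combining marks; it does not use Windows' real mapping tables.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codec)
        except UnicodeEncodeError:
            # Decompose the character and keep only the non-combining parts.
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            # Anything still unencodable becomes '?'.
            out += base.encode(codec, errors="replace")
    return bytes(out)
```

For example, transcode(b"\xe9", "cp1252", "cp437") maps the same character (e with acute accent) from byte 0xE9 to byte 0x82, and approximate_best_fit() degrades that character to a plain "e" when the target is ASCII.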

> I think it's a bit more complex for input/transform cases where you
> operate on the byte stream directly without intermediate conversion to
> Unicode, which is why UTF-8 doesn't work as a codepage, but again I
> don't have the details nearby. I can try to do more digging if
> needed.

It does (sort of) work as a codepage; it just doesn't have the NLS file
that defines how things like UPPER() and LOWER() should work.
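
As a rough illustration of why case mapping needs character-level rules rather than just a byte encoding, here is a small Python sketch. It says nothing about how Windows or PostgreSQL implement this; upper_bytewise() and upper_via_unicode() are hypothetical names used only for the comparison.

```python
def upper_bytewise(data: bytes) -> bytes:
    """Uppercase with no character-level rules: only ASCII bytes change."""
    return data.upper()


def upper_via_unicode(data: bytes, encoding: str = "utf-8") -> bytes:
    """Decode to Unicode, apply the case mapping there, then re-encode."""
    return data.decode(encoding).upper().encode(encoding)


sample = "\u00e9".encode("utf-8")   # e with acute accent, bytes C3 A9
bytewise = upper_bytewise(sample)   # multibyte sequence is left untouched
mapped = upper_via_unicode(sample)  # becomes E with acute accent, bytes C3 89
```

The byte-wise version leaves the two-byte sequence alone because neither byte is an ASCII letter, which is roughly the situation of having the encoding without the rules for UPPER() and LOWER().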

Regards, Dave

In response to

Re: Locale + encoding combinations at 2007-10-12 13:03:52 from Trevor Talbot
