From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales) |
Date: | 2014-06-02 15:59:26 |
Message-ID: | 13681.1401724766@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 06/01/2014 05:35 PM, Tom Lane wrote:
>> I did a little bit of experimentation and determined that none of the
>> LATIN1 characters are significantly more portable than what we've got:
>> for instance a-acute fails to convert into 16 of the 33 supported
>> server-side encodings (versus 17 failures for U+0080). However,
>> non-breaking space is significantly better: it converts into all our
>> supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW.
>> It seems likely that we won't do better than that except with a basic
>> ASCII character.

> Yeah, I just looked at the copyright symbol, with similar results.

I'd been hopeful about that one too, but nope :-(

> Let's just stick to ASCII.

The more I think about it, the more I think that using a plain-ASCII
character would defeat most of the purpose of the test. Non-breaking
space seems like the best bet here, not least because it has several
different representations among the encodings we support.
regards, tom lane
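
The experiment described above (checking which encodings a candidate character converts into, and what bytes it becomes) can be approximated outside the server. The following is only a rough sketch, not the procedure used in the message: it relies on Python's built-in codecs as stand-ins for PostgreSQL's server encodings, the codec list and the helper names `encode_or_none` and `representations` are made up for illustration, and Python's conversion tables are not PostgreSQL's, so the results need not match the failure counts quoted above.

```python
# Hypothetical stand-ins: Python codec names chosen to resemble a few
# PostgreSQL server encodings. Python's conversion tables are not
# PostgreSQL's, so results may differ from the counts in the message.
CODECS = ["utf-8", "latin-1", "iso8859-2", "cp1252", "koi8-r", "cp866",
          "euc_jp", "euc_kr"]

NBSP = "\u00a0"  # non-breaking space, the character proposed for the test


def encode_or_none(ch, codec):
    """Return the bytes for ch in codec, or None if it cannot be encoded."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None


def representations(ch, codecs=CODECS):
    """Map each codec name to the encoded bytes (or None on failure)."""
    return {codec: encode_or_none(ch, codec) for codec in codecs}


if __name__ == "__main__":
    # Print one line per codec: the hex bytes, or a note if encoding fails.
    for codec, raw in representations(NBSP).items():
        shown = raw.hex() if raw is not None else "not representable"
        print(f"{codec:10} {shown}")
```

Passing a different character (for instance `"\u00e1"` for a-acute or `"\u00a9"` for the copyright symbol) to `representations` gives a quick comparison of candidates, with the caveat that only a check against the server's own conversions is authoritative.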