Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales)

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales)
Date: 2014-06-01 21:57:57
Message-ID: 538BA1E5.6040406@dunslane.net
Lists: pgsql-hackers


On 06/01/2014 05:35 PM, Tom Lane wrote:
> I wrote:
>> 3. Try to select some "more portable" non-ASCII character, perhaps U+00A0
>> (non breaking space) or U+00E1 (a-acute). I think this would probably
>> work for most encodings but it might still fail in the Far East. Another
>> objection is that the expected/plpython_unicode.out file would contain
>> that character in UTF8 form. In principle that would work, since the test
>> sets client_encoding = utf8 explicitly, but I'm worried about accidental
>> corruption of the expected file by text editors, file transfers, etc.
>> (The current usage of U+0080 doesn't suffer from this risk because psql
>> special-cases printing of multibyte UTF8 control characters, so that we
>> get exactly "\u0080".)
> I did a little bit of experimentation and determined that none of the
> LATIN1 characters are significantly more portable than what we've got:
> for instance a-acute fails to convert into 16 of the 33 supported
> server-side encodings (versus 17 failures for U+0080). However,
> non-breaking space is significantly better: it converts into all our
> supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW.
> It seems likely that we won't do better than that except with a basic
> ASCII character.
>

Yeah, I just looked at the copyright symbol, with similar results.

Let's just stick to ASCII.
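
For anyone who wants to poke at this without a server handy, here is a rough sketch of the same kind of check using Python's own codecs. It is not how the numbers quoted above were obtained: Python's codec tables are not PostgreSQL's conversion tables, the mapping from PostgreSQL encoding names to Python codec names below is my assumption, and it covers only a handful of encodings, so the results may differ from what the server does.

```python
# Rough portability check for candidate test characters.
# ASSUMPTION: these Python codecs are only approximate stand-ins for the
# PostgreSQL server encodings of the same name; the server's conversion
# tables may accept or reject different characters.
CODECS = {
    "LATIN1": "iso8859-1",
    "LATIN2": "iso8859-2",
    "WIN1252": "cp1252",
    "KOI8R": "koi8_r",
    "EUC_JP": "euc_jp",
    "EUC_KR": "euc_kr",
}

# Characters discussed in the thread, plus a plain ASCII letter for comparison.
CANDIDATES = {
    "U+0080": "\u0080",
    "U+00A0": "\u00a0",
    "U+00A9": "\u00a9",
    "U+00E1": "\u00e1",
    "U+0041": "A",
}


def unencodable(ch):
    """Return the (PostgreSQL-style) encoding names that cannot encode ch."""
    failures = []
    for pg_name, codec in CODECS.items():
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            failures.append(pg_name)
    return failures


for label, ch in CANDIDATES.items():
    print(label, "fails in:", unencodable(ch))
```

Whatever the other rows show, the ASCII row has no failures in any of these codecs, which is the point.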

cheers

andrew
