Re: proposal: UTF8 to_ascii function

From:	"Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To:	"Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc:	"PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: proposal: UTF8 to_ascii function
Date:	2008-08-11 14:48:03
Message-ID:	162867790808110748i1b4af15epbbd4c2ef0d9473bf@mail.gmail.com
Lists:	pgsql-hackers

2008/8/11 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>
> Pavel Stehule wrote:
>>
>>
>> One note - convert_to is correct. But we have to use to_ascii without
>> the decode functions. It has the same behavior - it converts from bytea
>> to text. Text in an "incorrect" encoding is de facto bytea. So the
>> correct to_ascii function prototypes are:
>>
>> to_ascii(text)
>> to_ascii(bytea, integer)
>> to_ascii(bytea, name)
>>
>>
>>>
>>>
>
> What you have not said is how you propose to convert UTF8 to ASCII.
>
> Currently to_ascii() converts a small number of single byte charsets to
> ASCII by folding the chars with high bits set, so what we get is a pure
> ASCII result which is safe in any server encoding, as they are all ASCII
> supersets.
>
> But what conversion rule will you use for the gazillions of Unicode
> characters?
>
> I honestly do not understand the use case for this at all.
>

It's a typical case in the Czech language, where some searches are
accent-insensitive, e.g. Stěhule, Stehule, Novotný, Novotny.
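
To illustrate the kind of folding meant here, a minimal sketch in Python
(not the proposed PostgreSQL implementation; the function names
fold_to_ascii and accent_insensitive_equal are hypothetical). It assumes
that Unicode NFKD decomposition followed by dropping every non-ASCII code
point is an acceptable conversion rule, which works for Czech accents but
silently discards characters that have no ASCII base letter:

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Fold accented letters to their ASCII base letters.

    Assumption: NFKD decomposition splits a letter such as U+011B into
    'e' plus a combining caron; encoding to ASCII with errors="ignore"
    then drops the combining mark. Characters with no ASCII decomposition
    are dropped entirely.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def accent_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring accents and letter case."""
    return fold_to_ascii(a).casefold() == fold_to_ascii(b).casefold()


if __name__ == "__main__":
    for name in ("Stěhule", "Novotný"):
        print(name, "->", fold_to_ascii(name))
```

With this rule, a search for "Stehule" and a search for "Stěhule" compare
equal, which is the accent-insensitive behaviour described above.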

> cheers
>
> andrew
>

In response to

Re: proposal: UTF8 to_ascii function at 2008-08-11 13:17:28 from Andrew Dunstan
