Re: spanish locale question

From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: spanish locale question
Date: 2012-05-04 18:54:18
Message-ID: 4FA425DA.5040707@officenet.no
Lists: pgsql-general

On 05/04/2012 07:31 PM, Tom Lane wrote:
> Al Eridani <al(dot)eridani(at)gmail(dot)com> writes:
>> What Tulio is saying is that 'leon' and 'león' are the same thing from
>> the point of view of sorting in Spanish, but his PostgreSQL seems to
>> think that 'leon' goes before 'león'.
> Postgres never considers that two distinct strings are "equal". If the
> locale setting considers these equal (which isn't entirely clear from
> the given evidence), PG would then sort them on the basis of their
> character code values.
>
> A possible workaround if you need to consider them equal is to strip the
> accents before sorting (ie, something like "ORDER BY to_ascii(col)") but
> this may well throw away more information than you want ...

Note that to_ascii fails on UTF8-encoded input (it only supports conversion
from LATIN1, LATIN2, LATIN9 and WIN1250):

ERROR: encoding conversion from UTF8 to ASCII not supported

Better to install the unaccent contrib module instead:

$ cd ./postgresql-9.1.2/contrib/unaccent
$ make install
$ psql
andreak=# CREATE EXTENSION unaccent;
andreak=# select unaccent('león');
 unaccent
----------
 leon
(1 row)
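
For the sorting problem itself, unaccent() can go straight into the ORDER BY
clause. This is only a rough sketch: the table "palabras" and its column
"palabra" are made up for illustration, and it assumes the unaccent extension
has already been created in the database. The raw column is added as a second
sort key so that 'leon' and 'león' still come out in a deterministic order
relative to each other.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TEMP TABLE palabras (palabra text);
INSERT INTO palabras VALUES ('león'), ('leona'), ('leon'), ('lema');

-- Sort on the accent-stripped form first, so 'leon' and 'león' land
-- next to each other; the raw column only breaks ties between them.
SELECT palabra
FROM palabras
ORDER BY unaccent(palabra), palabra;
```

Which of the two tied strings comes first still depends on the collation in
use, so the second sort key gives a stable order, not necessarily the one a
Spanish dictionary would use.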

--
Andreas Joseph Krogh<andreak(at)officenet(dot)no> - mob: +47 909 56 963
Senior Software Developer / CEO - OfficeNet AS - http://www.officenet.no
Public key: http://home.officenet.no/~andreak/public_key.asc
