Re: [GENERAL] russian case-insensitive regexp search not working

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: alexander lunyov <lan(at)startatom(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] russian case-insensitive regexp search not working
Date: 2007-07-12 13:09:23
Message-ID: Pine.LNX.4.64.0707121705230.20068@sn.sai.msu.ru
Lists: pgsql-general pgsql-hackers

On Thu, 12 Jul 2007, alexander lunyov wrote:

> Oleg Bartunov wrote:
>> alexander,
>>
>> lc_ctype and lc_collate can be set only at initdb time!
>> You need to read the localization chapter:
>> http://www.postgresql.org/docs/current/static/charset.html
>
>
> Yes, I knew about this, but I thought maybe it could somehow be changed
> on the fly.
>
> ... (10 minutes later)
>
> Yes, now that initdb has been done with --locale=ru_RU.UTF-8, lower('RussianString')
> gives me 'russianstring', though case-insensitive regexp is still not working. I

Confirmed: I checked with --locale=ru_RU.UTF-8 on 8.2.4 and CVS HEAD.
There is no problem with --locale ru_RU.KOI8-R.

> guess I'll stick with the lower() ~ lower() construction.
>
> And thanks everybody who replied!
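
For readers of the archive, the idea behind the lower() ~ lower() construction
can be sketched outside the database. This is only a Python illustration, not
PostgreSQL code: the helper name ci_match is hypothetical, and Python's
str.lower() does not depend on lc_ctype the way the server's lower() does.
The sample strings are the ones from the grep test quoted further down.

```python
import re


def ci_match(text: str, pattern: str) -> bool:
    """Case-insensitive regex match done by lowercasing both operands."""
    # Caveat: lowercasing the pattern also lowercases escapes such as \W or \S,
    # which changes their meaning. This sketch assumes a pattern without them.
    return re.search(pattern.lower(), text.lower()) is not None


# Sample rows taken from the grep test in this thread.
rows = ["Зеленая", "Зеленодольская", "зеленая"]

# Keep every row that matches the pattern regardless of case.
matches = [row for row in rows if ci_match(row, "ЗЕЛЕН")]
print(matches)
```

The same trade-off would apply to the SQL form: the pattern is rewritten
before matching, so it is safest with patterns made of plain letters.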
>
>>
>>
>> Oleg
>> On Thu, 12 Jul 2007, alexander lunyov wrote:
>>
>>> Tom Lane wrote:
>>>> alexander lunyov <lan(at)startatom(dot)ru> writes:
>>>>> With this i just wanted to say that lower() doesn't work at all on
>>>>> russian unicode characters,
>>>>
>>>> In that case you're using the wrong locale (ie, not russian unicode).
>>>> Check "show lc_ctype".
>>>
>>> db=> SHOW LC_CTYPE;
>>> lc_ctype
>>> ----------
>>> C
>>> (1 запись)
>>>
>>> db=> SHOW LC_COLLATE;
>>> lc_collate
>>> ------------
>>> C
>>> (1 запись)
>>>
>>> Where can I change this? Trying to SET these parameters gives the error
>>> "parameter "lc_collate" cannot be changed"
>>>
>>>> Or [ checks back in thread... ] maybe you're using the wrong operating
>>>> system. Not so long ago FreeBSD didn't have Unicode locale support at
>>>> all; I'm not sure if 6.2 has that problem but it is worth checking.
>>>> Does it work for you to do case-insensitive russian comparisons in
>>>> "grep", for instance?
>>>
>>> I put 3 Russian strings with different case of the first character into a
>>> text file and grep'ed them all:
>>>
>>> # cat > textfile
>>> Зеленая
>>> Зеленодольская
>>> зеленая
>>> # grep -i зелен *
>>> textfile:Зеленая
>>> textfile:Зеленодольская
>>> textfile:зеленая
>>>
>>> So I think the system is fine with Unicode.
>>>
>>>
>>
>> Regards,
>> Oleg
>> _____________________________________________________________
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
