Re: BUG #1931: ILIKE and LIKE fails on Turkish locale

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Devrim GUNDUZ" <devrim(at)gunduz(dot)org>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #1931: ILIKE and LIKE fails on Turkish locale
Date: 2005-10-01 16:31:41
Message-ID: 9028.1128184301@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-tr-genel

"Devrim GUNDUZ" <devrim(at)gunduz(dot)org> writes:
> http://sourceware.org/bugzilla/long_list.cgi?buglist=1354
> So it is PostgreSQL's bug or Glibc's?

Just offhand, iwchareq() seems several bricks shy of a load:

	/*
	 * if one of them is an ASCII while the other is not, then they must
	 * be different characters
	 */
	else if ((unsigned char) *p1 < CHARMAX || (unsigned char) *p2 < CHARMAX)
		return (0);

This test is wrong per Jakub's observation. Also, the code right below
that is using tolower() not towlower() on wide characters, which seems
pretty wrong. For that matter, towlower would be wrong too :-( because
there is no certainty that libc's idea of wide characters is the same as
pg_mb2wchar_with_len's.
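
To make the first problem concrete, here is a minimal standalone sketch, not PostgreSQL code. It assumes the shortcut treats anything below CHARMAX as ASCII (as the quoted comment says) and works on Unicode code points. The helper names turkish_lower, shortcut_says_different and fold_equal are hypothetical, and the folding table covers only ASCII letters plus the two Turkish-specific I's:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: lower-case one Unicode code point using Turkish
 * rules. Only ASCII letters and the two Turkish-specific I's are handled;
 * this is an illustration, not a complete case-folding table.
 */
uint32_t turkish_lower(uint32_t cp)
{
    assert(cp <= 0x10FFFF);             /* must be a valid code point */
    if (cp == 'I')
        return 0x0131;                  /* I -> dotless i (non-ASCII) */
    if (cp == 0x0130)
        return 'i';                     /* dotted capital I -> i (ASCII) */
    if (cp >= 'A' && cp <= 'Z')
        return cp + ('a' - 'A');
    return cp;
}

/*
 * The quoted shortcut, restated on code points: if exactly one side is
 * ASCII, the characters are declared different without any case folding.
 */
int shortcut_says_different(uint32_t a, uint32_t b)
{
    return (a < 0x80) != (b < 0x80);
}

/* What a case-insensitive comparison should answer under Turkish rules. */
int fold_equal(uint32_t a, uint32_t b)
{
    return turkish_lower(a) == turkish_lower(b);
}
```

Under these rules 'I' and U+0131 are the same letter in different cases, yet the shortcut rejects the pair before any folding happens, because one is ASCII and the other is not.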

So yeah, ILIKE looks just about completely broken for multibyte encodings.
Maybe it would be best to pass both strings through lower() and then
do a normal LIKE comparison?
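
A rough sketch of that idea in standalone C, under stated assumptions: it is not the backend's lower() or LIKE code, the names fold_to_wide, like_wide and ilike_sketch are hypothetical, escape characters are not handled, and it relies on the POSIX behaviour of mbstowcs() with a NULL destination. Both strings are decoded by libc itself, so towlower() sees libc's own wide-character representation, and the result depends on whatever LC_CTYPE locale the caller has set:

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Decode a multibyte string with libc and lower-case it; caller frees. */
wchar_t *fold_to_wide(const char *s)
{
    assert(s != NULL);
    size_t n = mbstowcs(NULL, s, 0);    /* POSIX: returns length needed */
    if (n == (size_t) -1)
        return NULL;                    /* invalid multibyte sequence */
    wchar_t *w = malloc((n + 1) * sizeof(wchar_t));
    if (w == NULL)
        return NULL;
    mbstowcs(w, s, n + 1);
    for (size_t i = 0; i < n; i++)
        w[i] = (wchar_t) towlower((wint_t) w[i]);
    return w;
}

/* Plain case-sensitive LIKE: '%' matches any run, '_' matches one char. */
int like_wide(const wchar_t *t, const wchar_t *p)
{
    if (*p == L'\0')
        return *t == L'\0';
    if (*p == L'%')
    {
        /* try every possible amount of text for the wildcard to swallow */
        for (;; t++)
        {
            if (like_wide(t, p + 1))
                return 1;
            if (*t == L'\0')
                return 0;
        }
    }
    if (*t == L'\0')
        return 0;
    if (*p != L'_' && *p != *t)
        return 0;
    return like_wide(t + 1, p + 1);
}

/* Returns 1 on match, 0 on no match, -1 if either string cannot be decoded. */
int ilike_sketch(const char *text, const char *pattern)
{
    wchar_t *t = fold_to_wide(text);
    wchar_t *p = fold_to_wide(pattern);
    int result = (t != NULL && p != NULL) ? like_wide(t, p) : -1;

    free(t);
    free(p);
    return result;
}
```

Folding whole strings up front costs an extra pass and two allocations, but it keeps the matcher itself free of any per-character case logic.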

The regexp code doesn't look better, btw, just differently broken ...

regards, tom lane
