Turkish locale bug

From: Sezai YILMAZ <sezaiy(at)ata(dot)cs(dot)hun(dot)edu(dot)tr>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Turkish locale bug
Date: 2001-02-19 11:50:05
Message-ID: 3A91086D.33155129@ata.cs.hun.edu.tr
Lists: pgsql-bugs pgsql-hackers

Your name : Sezai YILMAZ
Your email address : sezaiy(at)ata(dot)cs(dot)hun(dot)edu(dot)tr

System Configuration
---------------------
Architecture (example: Intel Pentium) : AMD Duron

Operating System (example: Linux 2.0.26 ELF) : Linux 2.2.17 ELF

PostgreSQL version (example: PostgreSQL-7.0): PostgreSQL-7.0.3

Compiler used (example: gcc 2.8.0) : gcc 2.95.3

Please enter a FULL description of your problem:
------------------------------------------------

Locale support for Turkish causes a problem with the character 'I'
(the capital form of the 9th letter of the English alphabet). When
'I' is passed to the tolower() function with the locale set to
"tr_TR", it is lowercased to the special Turkish character 'ı'
(dotless i, which sits at the "y acute" position of ISO-8859-1 in
the Turkish ISO-8859-9 encoding), not to 'i'. This causes the
following problem:

With the Turkish locale it is not possible to write SQL queries in
CAPITAL letters. SQL keywords like "INSERT" and "UNION" are first
lowercased to "ınsert" and "unıon", and these strings then fail to
match any SQL keyword.

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------

When the "LC_ALL" environment variable is set to "tr_TR" in the
shell that starts postmaster, this problem happens.
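The tolower() behaviour can also be checked outside PostgreSQL. The
following is only a minimal sketch: the helper name
lower_of_capital_i is hypothetical (it is not part of the PostgreSQL
source), and it assumes that a "tr_TR" locale may or may not be
installed on the test machine, so it reports -1 in that case.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the PostgreSQL source): returns what
 * tolower('I') yields under the named locale, or -1 if that locale
 * cannot be selected. The process locale is reset to "C" on return.
 */
int
lower_of_capital_i(const char *locale_name)
{
    int result;

    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;              /* locale not available on this system */

    result = tolower((unsigned char) 'I');

    setlocale(LC_ALL, "C");     /* go back to the default C locale */
    return result;
}
```

Calling lower_of_capital_i("C") gives 'i'. With a single-byte Turkish
locale, lower_of_capital_i("tr_TR") is expected to give a value other
than 'i', though the exact result depends on the C library and on the
encoding of the installed locale.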

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------

In file:

[postgresqlsourcepath]/src/backend/parser/scan.l

The following block uses the tolower() function, which is affected
by the locale settings of the shell that runs postmaster.

================================================================
{identifier}    {
                    int         i;
                    ScanKeyword *keyword;

                    for(i = 0; yytext[i]; i++)
                        if (isascii((unsigned char)yytext[i]) &&
                            isupper(yytext[i]))
                            yytext[i] = tolower(yytext[i]);
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
================================================================

I think it would be better to use something that does what
tolower() does, but only for the English (ASCII) letters and
independent of the current locale. I think this will solve the
problem.

In ASCII, 'a' - 'A' = 32.

So the following line can be used in place of the marked line in
the block above:

yytext[i] += 32;
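
The same idea can be written as a small helper that checks the range
itself, so that it depends neither on tolower() nor on isupper().
This is only a sketch: the names ascii_tolower and ascii_downcase are
hypothetical, and an ASCII-compatible character set is assumed.

```c
/*
 * Hypothetical helper (the name is illustrative only): lowercases
 * A-Z by ASCII arithmetic and returns every other value unchanged,
 * so the result does not depend on the current locale.
 * Assumes an ASCII-compatible execution character set.
 */
int
ascii_tolower(int ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return ch + ('a' - 'A');    /* 'a' - 'A' is 32 in ASCII */
    return ch;
}

/*
 * Lowercase a NUL-terminated buffer in place, in the same way the
 * scanner loop walks over yytext.
 */
void
ascii_downcase(char *text)
{
    int i;

    for (i = 0; text[i]; i++)
        text[i] = (char) ascii_tolower((unsigned char) text[i]);
}
```

With this helper, "INSERT" becomes "insert" whatever LC_ALL is set
to, and bytes outside A-Z (including the Turkish letters) are left
alone. The explicit range check matters because adding 32 is only
correct for an uppercase ASCII letter.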
