Re: BUG #10707: UPPER() does not convert non-ASCII chars

Lists: pgsql-bugs
From: sf(at)4js(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #10707: UPPER() does not convert non-ASCII chars
Date: 2014-06-20 14:55:38
Message-ID: 20140620145538.2634.85619@wrigleys.postgresql.org

The following bug has been logged on the website:

Bug reference: 10707
Logged by: FLAESCH Sebastien
Email address: sf(at)4js(dot)com
PostgreSQL version: 9.4beta1
Operating system: Linux Debian (3.14-1-amd64 #1 SMP Debian 3.14.4-1)
Description:

I created my test1 database with the UTF8 encoding. When I use the UPPER() function, only
ASCII characters are converted to uppercase.

Am I missing a configuration option?

I also have 9.3.2 installed, and there the characters are converted to uppercase.

test1=# SELECT pg_encoding_to_char(encoding) FROM pg_database WHERE datname = 'test1';
 pg_encoding_to_char
---------------------
 UTF8
(1 row)

test1=# select upper('âãäåçèéêëô');
   upper
------------
 âãäåçèéêëô
(1 row)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: sf(at)4js(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #10707: UPPER() does not convert non-ASCII chars
Date: 2014-06-20 16:19:48
Message-ID: 6554.1403281188@sss.pgh.pa.us

sf(at)4js(dot)com writes:
> I created my test1 database with the UTF8 encoding. When I use the UPPER() function, only
> ASCII characters are converted to uppercase.

> Am I missing a configuration option?

No, what matters is the locale (particularly lc_ctype). You set that at initdb or
database creation time.

regards, tom lane
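
To make the answer above concrete, here is a minimal shell sketch of how one might inspect a database's lc_ctype and create a database with an explicit locale. The function names db_ctype and create_utf8_db are hypothetical helpers, not part of PostgreSQL. The sketch assumes that psql and createdb are on the PATH, that they can connect with default credentials, and that the en_US.utf8 locale is installed on the server.

```shell
#!/bin/sh
# Hypothetical helper: print the LC_CTYPE setting of the database named in $1.
# -A (unaligned) and -t (tuples only) make psql print just the bare value.
db_ctype() {
    psql -At -d "$1" \
        -c "SELECT datctype FROM pg_database WHERE datname = current_database()"
}

# Hypothetical helper: create database $1 with UTF8 encoding and an explicit
# locale ($2, defaulting to en_US.utf8, which is an assumption about the host).
# -T template0 allows a locale that differs from the one template1 uses.
create_utf8_db() {
    createdb -E UTF8 -l "${2:-en_US.utf8}" -T template0 "$1"
}

# Example usage (left commented out because it needs a running server):
#   db_ctype test1
#   create_utf8_db test2
#   db_ctype test2
```

If db_ctype prints C or POSIX, upper() would be expected to leave non-ASCII characters unchanged, which matches the behaviour reported in this thread.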


From: Sebastien FLAESCH <sf(at)4js(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #10707: UPPER() does not convert non-ASCII chars
Date: 2014-06-23 08:15:06
Message-ID: 53A7E20A.7060206@4js.com

Hello Amit,

Yes, I found the problem. I had created my database with:

$ createdb test1 -E utf8

It seems that I need to specify the locale explicitly with 9.4:

$ createdb test1 -E utf8 -l en_US.utf8

I think I did not use the -l option with 9.3.2.
Or maybe it defaults to the current LC_* settings in the shell, and I had different
settings when I created the database with 9.3.2. I wonder...

Sorry for reporting this as a bug; it works now.

Thanks for the answer!
Seb
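
On the question of defaults raised above: as far as I know, createdb without -l copies the locale of the template database, and that locale is fixed at initdb time, where it is taken from the environment if no --locale option is given. The sketch below shows how one could work out which LC_CTYPE value an environment implies, following the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG, then "C"). The function name effective_lc_ctype is hypothetical, and the sketch only inspects environment variables; it does not query PostgreSQL.

```shell
#!/bin/sh
# Hypothetical helper: print the LC_CTYPE value implied by the current
# environment, using POSIX precedence. Empty variables count as unset.
effective_lc_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"      # LC_ALL overrides every category
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"    # category-specific setting
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"        # general fallback
    else
        printf 'C\n'                 # typical default when nothing is set
    fi
}

# Example usage, e.g. in the shell that is about to run initdb:
#   effective_lc_ctype
```

Running this in the shells used to initialize the 9.3.2 and 9.4beta1 clusters would be one way to check whether different environment settings explain the different upper() results.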

On 06/20/2014 05:59 PM, Amit Langote wrote:
> On Sat, Jun 21, 2014 at 12:51 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> On Fri, Jun 20, 2014 at 11:55 PM, <sf(at)4js(dot)com> wrote:
>>> The following bug has been logged on the website:
>>>
>>> Bug reference: 10707
>>> Logged by: FLAESCH Sebastien
>>> Email address: sf(at)4js(dot)com
>>> PostgreSQL version: 9.4beta1
>>> Operating system: Linux Debian (3.14-1-amd64 #1 SMP Debian 3.14.4-1)
>>> Description:
>>>
>>> I created my test1 database with the UTF8 encoding. When I use the UPPER() function, only
>>> ASCII characters are converted to uppercase.
>>>
>>> Am I missing a configuration option?
>>>
>>> I also have 9.3.2 installed, and there the characters are converted to uppercase.
>>>
>>> test1=# SELECT pg_encoding_to_char(encoding) FROM pg_database WHERE datname
>>> = 'test1';
>>> pg_encoding_to_char
>>> ---------------------
>>> UTF8
>>> (1 row)
>>>
>>> test1=# select upper('âãäåçèéêëô') ;
>>> upper
>>> ------------
>>> âãäåçèéêëô
>>> (1 row)
>>>
>>
>> What locale is your database using? Is it 'C'?
>>
>> --
>> Amit
>
> Or what does the following say:
>
> SELECT datctype FROM pg_database WHERE datname = 'test1';
>
> --
> Amit
>