compare lower case and upper case when encoding is utf-8

Lists: pgsql-hackers
From: Quan Zongliang <quanzongliang(at)gmail(dot)com>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: compare lower case and upper case when encoding is utf-8
Date: 2012-06-16 08:21:15
Message-ID: 4FDC41FB.7060302@gmail.com

Hi hackers,

I found that lower case compares as less than upper case when the
database is created with UTF-8 encoding.
I tried the following:
locale en_US.utf8 'A'<'a' false
locale ja_JP.utf8 'A'<'a' true
locale zh_CN.utf8 'A'<'a' false
Under Windows
locale Chinese_China 'A'<'a' false

I am not sure whether this is normal.
But in Chinese, lower case should be greater than upper case, the same
as in locale C.

I wrote some code to try to fix it.
It seems to work fine.

Quan Zongliang


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Quan Zongliang <quanzongliang(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compare lower case and upper case when encoding is utf-8
Date: 2012-06-17 23:13:16
Message-ID: 1339974796.18469.3.camel@vanquo.pezone.net

On Sat, 2012-06-16 at 16:21 +0800, Quan Zongliang wrote:
> I found that lower case is less than upper case when the db is
> created
> with utf8.
> I tried below
> locale en_US.utf8 'A'<'a' false
> locale ja_JP.utf8 'A'<'a' true
> locale zh_CN.utf8 'A'<'a' false
> Under Windows
> locale Chinese_China 'A'<'a' false
>
> I am not sure it is normal or not.
> But in Chinese, the lower case should be greater than upper, same as
> locale C.

The operating system locale determines that, so you need to look there
if you don't agree with the result.

http://wiki.postgresql.org/wiki/FAQ#Why_do_my_strings_sort_incorrectly.3F
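
As an illustration of the point above (not part of the original message), here is a minimal Python sketch that asks the operating system's collation routine how 'A' and 'a' compare under several locales. The helper name collates_before is hypothetical, and the locale names are the ones from the report; whether each one is installed, and what it returns, depends on the machine. Only the C locale is assumed to be present everywhere.

```python
import locale


def collates_before(a, b, locale_name):
    """Return True if a sorts before b under locale_name.

    Returns None when the operating system does not provide that locale.
    The comparison itself is done by the OS collation routine (strcoll).
    """
    # Remember the current LC_COLLATE setting so it can be restored afterwards.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # Locale not installed on this system.
        return None
    try:
        return locale.strcoll(a, b) < 0
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    # Locale names taken from the report; availability varies by system.
    for name in ("C", "en_US.utf8", "ja_JP.utf8", "zh_CN.utf8"):
        print(name, "'A'<'a'", collates_before("A", "a", name))
```

Running this on the machine that hosts the database shows which answers come from the operating system's locale definitions rather than from PostgreSQL itself. A result of None only means the locale is not installed there.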


From: Quan Zongliang <quanzongliang(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: compare lower case and upper case when encoding is utf-8
Date: 2012-06-18 01:44:26
Message-ID: 4FDE87FA.6080101@gmail.com

On 2012/6/18 7:13, Peter Eisentraut wrote:
> On Sat, 2012-06-16 at 16:21 +0800, Quan Zongliang wrote:
>> I found that lower case is less than upper case when the db is
>> created
>> with utf8.
>> I tried below
>> locale en_US.utf8 'A'<'a' false
>> locale ja_JP.utf8 'A'<'a' true
>> locale zh_CN.utf8 'A'<'a' false
>> Under Windows
>> locale Chinese_China 'A'<'a' false
>>
>> I am not sure it is normal or not.
>> But in Chinese, the lower case should be greater than upper, same as
>> locale C.
> The operating system locale determines that, so you need to look there
> if you don't agree with the result.
>
> http://wiki.postgresql.org/wiki/FAQ#Why_do_my_strings_sort_incorrectly.3F
>
>
I see, thank you.

Quan Zongliang