Lists: | pgsql-hackers |
---|
From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Non-C locale and LIKE |
Date: | 2004-11-28 04:52:56 |
Message-ID: | 200411280452.iAS4quv12028@candle.pha.pa.us |
I know we can't currently use an index with non-C locales and LIKE
except when we create a special type of index for LIKE indexing
(text_pattern_ops).
However, I am wondering if we should create a character lookup during
initdb that has the characters ordered so we can do:
col LIKE 'ha%' AND col >= 'ha' AND col < 'hb'
Could we do this easily for single-byte encodings? We could have:
A 1
B 2
C 3
and a non-C locale could be:
A 1
A` 2
B 3
We can't handle multi-byte encodings because the number of combinations
is too large or not known.
Also, we mention you should use the "C" locale to use normal indexes for
LIKE but isn't it more correct to say the encoding has to be SQL_ASCII?
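The prefix-to-range rewrite described above can be sketched as follows. This is only a minimal illustration, assuming plain code-point ordering (which is how a "C" locale effectively behaves for ASCII text); the function name `like_prefix_range` is hypothetical and is not a PostgreSQL API.

```python
from typing import Tuple


def like_prefix_range(prefix: str) -> Tuple[str, str]:
    """Return (lower, upper) so that, under code-point ordering,
    a string starts with `prefix` exactly when lower <= s < upper."""
    if not prefix:
        raise ValueError("an empty prefix has no finite range")
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        raise ValueError("cannot increment the last character")
    # Bump the last character by one: 'ha' -> 'hb'.
    upper = prefix[:-1] + chr(last + 1)
    return prefix, upper


rows = ["ha", "hat", "happy", "hb", "h", "gz", "hazel"]
lower, upper = like_prefix_range("ha")

# Python compares str values by code point, standing in for the C locale here.
by_range = sorted(s for s in rows if lower <= s < upper)
by_like = sorted(s for s in rows if s.startswith("ha"))
```

The upper bound is exclusive because 'hb' itself does not match 'ha%'. The equivalence between the two filters holds only because the comparison is a plain code-point comparison; nothing here says it carries over to other locales.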
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | pgman(at)candle(dot)pha(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Non-C locale and LIKE |
Date: | 2004-11-28 08:25:24 |
Message-ID: | 20041128.172524.74751187.t-ishii@sra.co.jp |
> I know we can't currently use an index with non-C locales and LIKE
> except when we create a special type of index for LIKE indexing
> (text_pattern_ops).
>
> However, I am wondering if we should create a character lookup during
> initdb that has the characters ordered so we can do:
>
> col LIKE 'ha%' AND col >= 'ha' AND col < 'hb'
>
> Could we do this easily for single-byte encodings? We could have:
>
> A 1
> B 2
> C 3
>
> and a non-C locale could be:
>
> A 1
> A` 2
> B 3
>
> We can't handle multi-byte encodings because the number of combinations
> is too large or not known.
>
> Also, we mention you should use the "C" locale to use normal indexes for
> LIKE but isn't it more correct to say the encoding has to be SQL_ASCII?
Why? "C" locale works well for multibyte encodings such as EUC-JP too.
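To illustrate why byte-wise ordering is indifferent to the encoding, here is a small sketch that applies the same range trick to EUC-JP encoded byte strings. It assumes that "C" locale comparison amounts to comparing raw bytes; `prefix_range_bytes` is a hypothetical helper, not anything from the PostgreSQL source.

```python
from typing import Optional, Tuple


def prefix_range_bytes(prefix: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return a byte-wise range [lower, upper) that covers every byte
    string starting with `prefix`. `upper` is None when no finite
    upper bound exists (the prefix is empty or consists only of 0xFF)."""
    trimmed = prefix.rstrip(b"\xff")  # a 0xFF byte cannot be incremented
    if not trimmed:
        return prefix, None
    upper = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, upper


words = ["日本語", "日本", "日曜日", "東京", "本日"]
encoded = [w.encode("euc_jp") for w in words]

prefix = "日本".encode("euc_jp")
lower, upper = prefix_range_bytes(prefix)

# Python compares bytes objects byte by byte, like memcmp.
in_range = {b for b in encoded if lower <= b and (upper is None or b < upper)}
matches = {b for b in encoded if b.startswith(prefix)}
```

Because the comparison never looks at character boundaries, the range test and the prefix test select the same rows here even though each character occupies two bytes.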
--
Tatsuo Ishii
From: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
---|---|
To: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Re: Non-C locale and LIKE |
Date: | 2004-11-28 08:29:49 |
Message-ID: | 200411280929.49995.peter_e@gmx.net |
Bruce Momjian wrote:
> However, I am wondering if we should create a character lookup during
> initdb that has the characters ordered so we can do:
That won't work. Real-life collations are too complicated.
> Also, we mention you should use the "C" locale to use normal indexes
> for LIKE but isn't it more correct to say the encoding has to be
> SQL_ASCII?
No, the locale decides the ordering.
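One concrete way a per-character lookup table falls short: Czech treats "ch" as a single letter that sorts after "h". The sketch below models only that one rule with a toy sort key; `toy_key` and its alphabet are hypothetical and are not real locale data, and the input is assumed to be lowercase ASCII.

```python
import string
from typing import List

# Toy alphabet: a..z with the digraph "ch" placed right after "h".
letters = list(string.ascii_lowercase)
letters.insert(letters.index("h") + 1, "ch")
RANK = {unit: i for i, unit in enumerate(letters)}


def toy_key(s: str) -> List[int]:
    """Sort key for the toy collation; treats "ch" as one unit."""
    key = []
    i = 0
    while i < len(s):
        if s.startswith("ch", i):
            key.append(RANK["ch"])
            i += 2
        else:
            key.append(RANK[s[i]])
            i += 1
    return key


words = ["cesta", "chata", "dum", "hora"]

# What LIKE 'c%' should return.
like = sorted(w for w in words if w.startswith("c"))

# What the range rewrite  col >= 'c' AND col < 'd'  returns under the toy collation.
in_range = sorted(
    w for w in words if toy_key("c") <= toy_key(w) < toy_key("d")
)
```

Under this ordering "chata" sorts after "hora", so it falls outside the 'c'..'d' range even though it matches LIKE 'c%'. The range condition would therefore drop a matching row, and this is with just one multi-character rule; real collations have many more.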
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Re: Non-C locale and LIKE |
Date: | 2004-11-28 14:35:58 |
Message-ID: | 200411281435.iASEZwP22492@candle.pha.pa.us |
Peter Eisentraut wrote:
> Bruce Momjian wrote:
> > However, I am wondering if we should create a character lookup during
> > initdb that has the characters ordered so we can do:
>
> That won't work. Real-life collations are too complicated.
OK.
> > Also, we mention you should use the "C" locale to use normal indexes
> > for LIKE but isn't it more correct to say the encoding has to be
> > SQL_ASCII?
>
> No, the locale decides the ordering.
Oh, OK.