From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: LIKE optimization in UTF-8 and locale-C |
Date: | 2007-03-23 03:25:25 |
Message-ID: | 20070323104410.635A.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
Lists: | pgsql-hackers pgsql-patches |
Hannu Krosing <hannu(at)skype(dot)net> wrote:
> > > We've had an optimization for single-byte encodings using
> > > pg_database_encoding_max_length() == 1 test. I'll propose to extend it
> > > in UTF-8 with locale-C case.
> >
> > If this works for UTF8, won't it work for all the backend-legal
> > encodings?
>
> I guess it works well for % but not for _ , the latter has to know, how
> many bytes the current (multibyte) character covers.
Yes, % is not used as a trailing byte in any of the encodings, but _ is
used as one in some of them. I think we can use the optimization for all
of the server encodings except JOHAB.
Also, I noticed that locale settings are not used in LIKE matching,
so the following should be enough to check whether the byte-wise matching
functions are available. Or am I missing something?
#define sb_match_available() (GetDatabaseEncoding() != PG_JOHAB)
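To illustrate the point quoted above about _ needing to know how many bytes the current character covers, here is a minimal sketch of a LIKE-style matcher for UTF-8 text. It is not the backend's matching code: the names `utf8_char_len`, `next_char`, and `demo_like` are hypothetical, and the sketch assumes valid, NUL-terminated UTF-8 input. Literals are compared byte by byte, while _ and the retry loop for % step over whole characters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Byte length of the UTF-8 character whose lead byte is b (assumes valid input). */
static size_t utf8_char_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;                   /* unexpected byte: step over it alone */
}

/* Advance past one character, never running past the terminating NUL. */
static const char *next_char(const char *s)
{
    size_t n = utf8_char_len((unsigned char) *s);

    while (n-- > 0 && *s != '\0')
        s++;
    return s;
}

/* Hypothetical demo matcher: % = any sequence, _ = one character, \ = escape. */
static bool demo_like(const char *t, const char *p)
{
    while (*p != '\0')
    {
        if (*p == '%')
        {
            while (*p == '%')
                p++;            /* collapse runs of % */
            if (*p == '\0')
                return true;    /* trailing % matches the rest */
            for (; *t != '\0'; t = next_char(t))
                if (demo_like(t, p))
                    return true;
            return false;
        }
        if (*t == '\0')
            return false;
        if (*p == '_')
        {
            t = next_char(t);   /* _ consumes one whole character */
            p++;
            continue;
        }
        if (*p == '\\' && p[1] != '\0')
            p++;                /* escaped byte is matched literally */
        if (*t != *p)
            return false;       /* literal bytes are compared byte-wise */
        t++;
        p++;
    }
    return *t == '\0';
}
```

With this sketch, `demo_like("h\xC3\xA9llo", "h_llo")` matches because _ steps over both bytes of the two-byte character, whereas a matcher that advanced a single byte for _ would not.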
Multi-byte encodings supported as a server encoding:
Encoding      | % 0x25 | _ 0x5f | \ 0x5c |
--------------+--------+--------+--------+
EUC_JP        | unused | unused | unused |
EUC_CN        | unused | unused | unused |
EUC_KR        | unused | unused | unused |
EUC_TW        | unused | unused | unused |
JOHAB         | unused | *used* | *used* |
UTF8          | unused | unused | unused |
MULE_INTERNAL | unused | unused | unused |
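For UTF8, the "unused" entries follow from the structure of the encoding: every trailing (continuation) byte has the bit pattern 10xxxxxx, so it lies in 0x80..0xBF and can never equal an ASCII byte. A small sketch that checks the three LIKE metacharacter bytes against that rule; the function names are hypothetical and only UTF-8 is covered here.

```c
#include <stdbool.h>
#include <stddef.h>

/* A UTF-8 trailing (continuation) byte always has the form 10xxxxxx. */
static bool utf8_is_trailing_byte(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* True if none of %, _ and \ can occur as a UTF-8 trailing byte. */
static bool utf8_like_metachars_are_safe(void)
{
    static const unsigned char meta[] = { 0x25, 0x5f, 0x5c };

    for (size_t i = 0; i < sizeof meta / sizeof meta[0]; i++)
        if (utf8_is_trailing_byte(meta[i]))
            return false;
    return true;
}
```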
Just for reference, encodings supported only as a client encoding:
Encoding      | % 0x25 | _ 0x5f | \ 0x5c |
--------------+--------+--------+--------+
SJIS          | unused | *used* | *used* |
BIG5          | unused | *used* | *used* |
GBK           | unused | *used* | *used* |
UHC           | unused | unused | unused |
GB18030       | unused | *used* | *used* |
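As an example of what a "*used*" entry means in practice, the Shift_JIS character with bytes 0x83 0x5C has 0x5c as its trailing byte, so a byte-wise scanner would mistake it for a backslash. The sketch below contrasts a byte-wise count with a character-aware one. The function names are hypothetical, and only two-byte Shift_JIS characters with lead bytes 0x81..0x9F and 0xE0..0xFC are modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Lead bytes of two-byte Shift_JIS characters: 0x81..0x9F and 0xE0..0xFC. */
static bool sjis_is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Naive byte-wise scan: every 0x5c byte is counted as a backslash. */
static size_t count_backslashes_bytewise(const unsigned char *s, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        if (s[i] == 0x5c)
            n++;
    return n;
}

/* Character-aware scan: the trailing byte of a two-byte character is skipped. */
static size_t count_backslashes_charwise(const unsigned char *s, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
    {
        if (sjis_is_lead_byte(s[i]) && i + 1 < len)
        {
            i++;                /* skip the trailing byte */
            continue;
        }
        if (s[i] == 0x5c)
            n++;
    }
    return n;
}
```

For the two bytes 0x83 0x5C, the byte-wise count reports one backslash and the character-aware count reports none, which is why byte-wise matching is unsafe for the encodings marked "*used*".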
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center