From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org |
Subject: | Re: UTF8MatchText |
Date: | 2007-05-17 17:48:10 |
Message-ID: | 464C955A.6050402@dunslane.net |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane wrote:
> UTF8 has disjoint representations for
> first-bytes and not-first-bytes of MB characters, and thus it is
> impossible to make a false match in which an MB pattern character is
> matched to the end of one data character plus the start of another.
> In character sets without that property, we have to use the slow way to
> ensure we don't make out-of-sync matches.
Thanks. I will include this info in the comments.
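
For illustration, the property Tom describes can be sketched in C roughly as follows. This is only a minimal sketch and not code from the patch: the helper names (is_utf8_continuation, is_utf8_char_boundary, bytewise_find) are hypothetical, and it assumes the data is already valid UTF8. Because continuation bytes always have the form 10xxxxxx and first bytes never do, a byte-wise match of a whole pattern character can only begin on a character boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true for UTF8 continuation bytes (10xxxxxx),
 * which can never be the first byte of a character. */
static int
is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: true if off is the start of a character (or the
 * end of the data) in valid UTF8 input of length len. */
static int
is_utf8_char_boundary(const char *data, size_t len, size_t off)
{
    assert(data != NULL && off <= len);
    return off == len || !is_utf8_continuation((unsigned char) data[off]);
}

/* Naive byte-wise search with no character-length bookkeeping.
 * Returns the byte offset of the first match, or -1 if there is none. */
static long
bytewise_find(const char *data, size_t dlen, const char *pat, size_t plen)
{
    size_t i;

    assert(data != NULL && pat != NULL);
    if (plen == 0 || plen > dlen)
        return -1;
    for (i = 0; i + plen <= dlen; i++)
    {
        if (memcmp(data + i, pat, plen) == 0)
            return (long) i;
    }
    return -1;
}
```

With valid UTF8 input and a pattern made of whole characters, any offset returned by bytewise_find should satisfy is_utf8_char_boundary, which is why the fast byte-at-a-time path is safe there. In an encoding where a trailing byte can have the same value as a first byte, the same search could report an out-of-sync match, so the slower character-by-character advance is still needed.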
cheers
andrew