From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | andrew(at)dunslane(dot)net |
Cc: | kleptog(at)svana(dot)org, markokr(at)gmail(dot)com, peter_e(at)gmx(dot)net, pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us |
Subject: | Re: Unicode string literals versus the world |
Date: | 2009-04-16 04:36:07 |
Message-ID: | 20090416.133607.115917738.t-ishii@sraoss.co.jp |
Lists: | pgsql-hackers |
> >>> I still stand on my proposal, how about extending E'' strings with
> >>> unicode escapes (eg. \uXXXX)? The E'' strings are already more
> >>> clearly defined than '' and they are our "own", we don't need to
> >>> consider random standards, but can consider our sanity.
> >>>
> >> I suspect there would be lots more support in the user community, where
> >> \uXXXX is well understood in a number of contexts (Java and ECMAScript,
> >> for example). It's also tolerably sane.
> >>
> >
> > By the way, that's an example of how to do it wrong, there are more
> > than 2^16 unicode characters, you want to be able to support the full
> > 21-bit range if you're going to do it right.
> >
> > FWIW, I prefer the perl syntax which simply extends \x: \x{1344}, which
> > makes it clear it's hex and doesn't make assumptions as to how many
> > characters are used.
> >
>
> I could live with either. Wikipedia says: "The characters outside the
> first plane usually have very specialized or rare use." For years we
> rejected all characters beyond the first plane, and while that's fixed
> now, the volume of complaints wasn't huge.
If by "first plane" you mean the BMP (i.e. the 16-bit range), the above
is not true, at least for PostgreSQL 7.3 or later.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
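
For illustration of the point raised in the quoted text (a fixed four-digit \uXXXX escape cannot name code points above U+FFFF, while a braced form such as \x{1344} can cover the full 21-bit range), here is a minimal sketch in Python. It is not PostgreSQL's implementation or any syntax the project adopted; the function name `decode_escapes`, the regular expression, and the choice to reject surrogates are all assumptions made for the example.

```python
import re

MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point (fits in 21 bits)

# Hypothetical escape forms, for illustration only:
#   \uXXXX     exactly four hex digits, so at most U+FFFF
#   \x{H...}   one to six hex digits in braces (Perl-style)
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\x\{([0-9A-Fa-f]{1,6})\}")


def decode_escapes(text: str) -> str:
    """Replace both escape forms in text with the characters they name."""

    def replace(match: re.Match) -> str:
        digits = match.group(1) or match.group(2)
        code_point = int(digits, 16)
        if code_point > MAX_CODE_POINT:
            raise ValueError(f"U+{code_point:X} is beyond U+10FFFF")
        if 0xD800 <= code_point <= 0xDFFF:
            # Assumption: lone surrogates are rejected rather than paired.
            raise ValueError(f"U+{code_point:X} is a surrogate")
        return chr(code_point)

    return _ESCAPE.sub(replace, text)


# The braced form reaches beyond the BMP; the four-digit form cannot.
beyond_bmp = decode_escapes(r"\x{1F600}")   # one character, U+1F600
truncated = decode_escapes(r"\u1F600")      # U+1F60 followed by a literal "0"
```

With the four-digit form, the only way to reach a character outside the BMP is a surrogate pair or a second, longer escape, which is the limitation the quoted objection is about.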