From: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
---|---|
To: | mark(at)mark(dot)mielke(dot)cc |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <gsstark(at)mit(dot)edu>, andrew(at)supernews(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Fixed length data types issue |
Date: | 2006-09-08 20:42:09 |
Message-ID: | 20060908204209.GH5892@alvh.no-ip.org |
Lists: | pgsql-hackers |
mark(at)mark(dot)mielke(dot)cc wrote:
> On Fri, Sep 08, 2006 at 02:39:03PM -0400, Alvaro Herrera wrote:
> > mark(at)mark(dot)mielke(dot)cc wrote:
> > > I think I've been involved in a discussion like this in the past. Was
> > > it mentioned in this list before? Yes the UTF-8 vs UTF-16 encoding
> > > means that UTF-8 applications are at a disadvantage when using the
> > > library. UTF-16 is considered more efficient to work with for everybody
> > > except ASCII users. :-)
> > Uh, is it? By whom? And why?
>
> The authors of the library in question? Java? Anybody whose primary
> alphabet isn't LATIN1 based? :-)
Well, for Latin-9 alphabets, Latin-9 is still more space-efficient than
UTF-8. That covers a lot of the world. Forcing those people to change
to UTF-16 does not strike me as a very good idea.

But Martijn already clarified that ICU does not actually force you to
switch everything to UTF-16, so this is not an issue anyway.
> Only ASCII values store more space efficiently in UTF-8. All values
> over 127 store more space efficiently using UTF-16. UTF-16 is easier
> to process. UTF-8 requires too many bit checks with single character
> offsets. I'm not an expert - I had this question before a year or two
> ago, and read up on the ideas of experts.
Well, I was not asking about "UTF-8 vs. UTF-16" but rather "anything vs.
UTF-16". I don't much like UTF-8 myself, but that's not a very informed
opinion, just a feeling that it is a "fly-killing cannon" when it's used
to store text that fits in Latin-9.
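
The size claims traded above are easy to check. What follows is a minimal sketch, not part of the original message: the helper name encoded_sizes and the sample strings are made up for illustration, and UTF-16 is measured without a byte-order mark so that only the character data is counted.

```python
# Hypothetical helper (not from the thread): byte counts of one string
# in Latin-9, UTF-8 and UTF-16.

def encoded_sizes(text):
    """Return the encoded length in bytes of `text` for each encoding.

    The Latin-9 entry is None when the text contains a character that
    ISO 8859-15 cannot represent.
    """
    try:
        latin9 = len(text.encode("iso8859-15"))
    except UnicodeEncodeError:
        latin9 = None
    return {
        "latin9": latin9,
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" is used so no 2-byte BOM is added to the count.
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    samples = [
        "hello",              # plain ASCII
        "d\u00e9j\u00e0 vu",  # accented Latin text
        "\u20ac",             # euro sign, U+20AC
        "\U0001d11e",         # musical G clef, outside the BMP
    ]
    for sample in samples:
        print(repr(sample), encoded_sizes(sample))
```

Under these assumptions, text that fits in Latin-9 is smallest in Latin-9, at one byte per character. Code points U+0080 through U+07FF take two bytes in both UTF-8 and UTF-16, so UTF-16 is only smaller for U+0800 through U+FFFF (three bytes versus two), and characters beyond that take four bytes in both.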
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support