Re: Fixed length data types issue

From: mark(at)mark(dot)mielke(dot)cc
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Gregory Stark <gsstark(at)mit(dot)edu>, andrew(at)supernews(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fixed length data types issue
Date: 2006-09-08 20:46:04
Message-ID: 20060908204604.GA17518@mark.mielke.cc
Lists: pgsql-hackers

On Fri, Sep 08, 2006 at 04:42:09PM -0400, Alvaro Herrera wrote:
> mark(at)mark(dot)mielke(dot)cc wrote:
> > The authors of the library in question? Java? Anybody whose primary
> > alphabet isn't LATIN1 based? :-)
> Well, for Latin-9 alphabets, Latin-9 is still more space-efficient than
> UTF-8. That covers a lot of the world. Forcing those people to change
> to UTF-16 does not strike me as a very good idea.

Ah. Thought you were talking UTF-8 vs UTF-16.

> But Martijn already clarified that ICU does not actually force you to
> switch everything to UTF-16, so this is not an issue anyway.

If my memory is correct, ICU does this by converting the input to UTF-16
first. That conversion is a performance cost (although it may not be worse
than PostgreSQL's current implementation :-) ).

> > Only ASCII values store more space efficiently in UTF-8. All values
> > over 127 store more space efficiently using UTF-16. UTF-16 is easier
> > to process. UTF-8 requires too many bit checks with single character
> > offsets. I'm not an expert - I had this question before a year or two
> > ago, and read up on the ideas of experts.
> Well, I was not asking about "UTF-8 vs UTF-16," but rather "anything vs.
> UTF-16". I don't much like UTF-8 myself, but that's not a very informed
> opinion, just like a feeling of "fly-killing-cannon" (when it's used to
> store Latin-9-fitting text).

*nod*
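
For anyone who wants to check the space argument rather than take it on
faith, here is a rough sketch in Python. It is only an illustration: the
sample characters and the helper names (encoded_size, size_table) are my
own choices, and "utf-16-le" is used so that no byte-order mark is counted.

```python
# Illustrative sketch: bytes needed per character in Latin-9, UTF-8 and UTF-16.
# The sample characters and helper names are assumptions made for this example.

ENCODINGS = ("iso8859-15", "utf-8", "utf-16-le")  # utf-16-le: no BOM counted

SAMPLES = {
    "a (U+0061)": "a",                 # plain ASCII
    "e-acute (U+00E9)": "\u00e9",      # Latin letter above 127
    "euro sign (U+20AC)": "\u20ac",    # present in Latin-9, absent from Latin-1
    "CJK ideograph (U+4E2D)": "\u4e2d",  # not representable in Latin-9
}


def encoded_size(text, encoding):
    """Return the byte length of text in encoding, or None if it cannot be encoded."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


def size_table(samples=SAMPLES, encodings=ENCODINGS):
    """Map each sample label to a dict of {encoding: byte length or None}."""
    return {
        label: {enc: encoded_size(ch, enc) for enc in encodings}
        for label, ch in samples.items()
    }


if __name__ == "__main__":
    for label, sizes in size_table().items():
        print(label, sizes)
```

Two things fall out of this. Latin-9 needs one byte for every character it
can represent, including the euro sign. And code points from U+0080 to
U+07FF (e-acute, for example) take two bytes in both UTF-8 and UTF-16, so
UTF-16 only comes out ahead of UTF-8 from U+0800 to U+FFFF, where UTF-8
needs three bytes and UTF-16 needs two.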

Cheers,
mark

--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
