Proof of concept COLLATE support with patch

Lists: pgsql-hackers
From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proof of concept COLLATE support with patch
Date: 2005-09-02 13:04:21
Message-ID: 20050902130420.GA15466@svana.org

Supports any glibc platform and possibly Win32.

Adds:
SELECT ... ORDER BY expr COLLATE 'locale'
CREATE INDEX locale_index ON table(expr COLLATE 'locale')
Index scan used when COLLATE order permits

This is just a proof of concept patch. I didn't send it to -patches
because as Tom pointed out, there's no hope of it getting in due to
platform-dependent behaviour.

This patch does not use setlocale and is completely orthogonal to any
locale support already in the backend.

As it turns out, meaningful locale support only needs a handful of
support functions to work. These are listed at the bottom. My patch
only uses the first two, but the third will be needed at some stage.
The use of the last one depends on how the backend ends up supporting
locales. Both glibc and Win32 have locale-sensitive versions of many
functions including:

toupper_l, tolower_l, strfmon_l, strtoul_l, strtof_l, strftime_l, is*_l

A windows function list is at:
http://msdn2.microsoft.com/library/wyzd2bce(en-us,vs.80).aspx

Patch available here:
http://svana.org/kleptog/pgsql/collate1.patch

Implementation notes follow and table of functions is at the bottom.

I hope this helps whenever someone gets around to full COLLATE support.

Have a nice day,

Notes:
* It works by replacing (expr COLLATE 'locale') with
pg_strxfrm(expr, pg_findlocale(locale))
in the parsetree.

pg_findlocale returns an opaque pointer to the locale. It is
STRICT IMMUTABLE and is optimised away in the final query.

pg_strxfrm takes the string and the locale and returns a bytea.
bytea comparison uses memcmp so is safe from other locale effects
in the backend.

* Use of COLLATE for an index will probably double the diskspace
required for that index due to the strxfrm.

* I had to add the functions to pg_proc.h because CREATE FUNCTION
couldn't find them. So they have OIDs I made up. You may need to
initdb, I'm not sure.

You can compile pg_xlocale.c as a shared object and load the functions
that way too if you want to avoid the initdb.

* Internally they are defined as taking and returning "internal".
CREATE FUNCTION doesn't like that so specify opaque or oid
instead. The declarations are:

create function pg_findlocale(text) returns oid as 'pg_findlocale' language internal strict immutable;
create function pg_strxfrm(text,oid) returns bytea as 'pg_strxfrm' language internal strict immutable;

* The clause ORDER BY 1 COLLATE 'en_AU' breaks, it treats the 1 like
a constant. I couldn't quickly work out how to reference the
columns the right way. Long term that code should be in the
sorting code anyway.

* The locale needs to be in quotes, otherwise the parser converts it
to lower-case. Locale names are case-sensitive on many systems.

* There is a test function pg_strcoll_l for testing collation:

create function pg_strcoll_l(text,text,text) returns int4 as 'pg_strcoll_l' language internal strict immutable;

* Yes this is the easy way out, implementing the inheritance of the
COLLATE attribute will be much more invasive. This gives most
people what they want though.

* Although these functions are documented on Windows, they are not
for glibc, so it is an unstable interface.

Function Needed                   glibc           Win32
---------------------------------------------------------------------
Function returning opaque         newlocale       _create_locale
pointer to locale data

strxfrm with locale parameter     strxfrm_l       _strxfrm_l

Method finding encoding for       nl_langinfo_l   ???
locale

strcoll with locale parameter     strcoll_l       _strcoll_l
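
The core idea in the notes above (turn each string into a sort key with strxfrm_l, then compare the keys bytewise) can be sketched outside the backend as follows. This is not the code from the patch: make_sort_key and compare_by_sort_key are hypothetical names used only for illustration, and the sketch assumes a platform that provides newlocale and strxfrm_l, such as glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>   /* macOS keeps the *_l functions here */
#endif

/*
 * Hypothetical stand-in for pg_strxfrm: returns a malloc'd sort key for s
 * under loc and stores its length (excluding the trailing NUL) in *keylen.
 */
char *
make_sort_key(const char *s, locale_t loc, size_t *keylen)
{
    size_t need;
    char  *key;

    assert(s != NULL && keylen != NULL);
    need = strxfrm_l(NULL, s, 0, loc);  /* first call only measures */
    key = malloc(need + 1);
    if (key == NULL)
        abort();                        /* sketch: no error recovery */
    strxfrm_l(key, s, need + 1, loc);
    *keylen = need;
    return key;
}

/*
 * Compare a and b under the named locale by comparing their sort keys
 * bytewise, the way a bytea comparison would.  Returns -1, 0 or 1, or -2
 * if the locale cannot be loaded.
 */
int
compare_by_sort_key(const char *locale_name, const char *a, const char *b)
{
    locale_t loc = newlocale(LC_ALL_MASK, locale_name, (locale_t) 0);
    size_t   alen, blen, minlen;
    char    *ka, *kb;
    int      r;

    if (loc == (locale_t) 0)
        return -2;
    ka = make_sort_key(a, loc, &alen);
    kb = make_sort_key(b, loc, &blen);
    minlen = alen < blen ? alen : blen;
    r = memcmp(ka, kb, minlen);
    if (r == 0 && alen != blen)
        r = alen < blen ? -1 : 1;       /* shorter key sorts first */
    free(ka);
    free(kb);
    freelocale(loc);
    return (r > 0) - (r < 0);
}
```

Calling compare_by_sort_key("en_AU", ...) would follow that locale's ordering only where the locale is installed. Comparing the key length returned by make_sort_key with strlen of the input gives a rough feel for how much larger a COLLATE index could become.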

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 15:13:47
Message-ID: 87k6hzzems.fsf@stark.xeocode.com
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:

> Supports any glibc platform and possibly Win32.
>
> Adds:
> SELECT ... ORDER BY expr COLLATE 'locale'
> CREATE INDEX locale_index ON table(expr COLLATE 'locale')
> Index scan used when COLLATE order permits
>
> This is just a proof of concept patch. I didn't send it to -patches
> because as Tom pointed out, there's no hope of it getting in due to
> platform-dependent behaviour.
>
> This patch does not use setlocale and is completely orthogonal to any
> locale support already in the backend.

I still don't get where the hostility towards this functionality comes from.
Just because some platforms provide a better interface than others doesn't
mean Postgres shouldn't do the best it can with what's available.

If there were an autoconf test for the *_l functions and a failover to calling
setlocale (safely protected) then it's just an issue that the feature will be
faster on some platforms than others. It'll still be the same behaviour on all
platforms. So there's no actual platform dependent Postgres behaviour.
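
A minimal sketch of that fallback, under stated assumptions: HAVE_STRCOLL_L is a hypothetical macro that a configure test would define, collate_compare is an illustrative name, and "safely protected" is read here as saving and restoring the process-wide LC_COLLATE setting around the call. The setlocale path is only reasonable in a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

#define COLLATE_ERROR (-2)

/*
 * Compare a and b under the named locale.  Returns -1, 0 or 1, or
 * COLLATE_ERROR if the locale cannot be used.  The result is meant to be
 * the same on both paths; only the cost differs.
 */
int
collate_compare(const char *locale_name, const char *a, const char *b)
{
    int r;

    assert(locale_name != NULL && a != NULL && b != NULL);
#ifdef HAVE_STRCOLL_L
    /* Fast path: per-call locale object, no global state touched. */
    locale_t loc = newlocale(LC_COLLATE_MASK, locale_name, (locale_t) 0);

    if (loc == (locale_t) 0)
        return COLLATE_ERROR;
    r = strcoll_l(a, b, loc);
    freelocale(loc);
#else
    /* Fallback: switch the global locale, compare, then switch back. */
    const char *current = setlocale(LC_COLLATE, NULL);
    char       *saved;

    if (current == NULL)
        return COLLATE_ERROR;
    saved = strdup(current);    /* the returned buffer may be overwritten */
    if (saved == NULL)
        return COLLATE_ERROR;
    if (setlocale(LC_COLLATE, locale_name) == NULL)
    {
        free(saved);
        return COLLATE_ERROR;
    }
    r = strcoll(a, b);
    setlocale(LC_COLLATE, saved);   /* restore the previous setting */
    free(saved);
#endif
    return (r > 0) - (r < 0);
}
```

How slow the fallback is depends entirely on how expensive setlocale is in the platform's libc.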

Should readline support be ripped out because not every platform will have
readline? Or O_DIRECT support? Or unix domain socket support?

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 15:42:21
Message-ID: 14696.1125675741@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> I still don't get where the hostility towards this functionality comes from.

We're not really willing to say "here is a piece of syntax REQUIRED
BY THE SQL SPEC which we only support on some platforms". readline,
O_DIRECT, and the like are a completely inappropriate analogy, because
those are inherently platform-dependent (and not in the spec).

The objection is fundamentally that a platform-specific implementation
cannot be our long-term goal, and so expending effort on creating one
seems like a diversion. If there were a plan put forward showing how
this is just a useful way-station, and we could see how we'd later get
rid of the glibc dependency without throwing away the work already done,
then it would be a different story.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 15:53:00
Message-ID: 20050902155255.GB15466@svana.org
Lists: pgsql-hackers

On Fri, Sep 02, 2005 at 03:04:20PM +0200, Martijn van Oosterhout wrote:
> Supports any glibc platform and possibly Win32.

MacOS X [1] supports this also apparently. And for glibc it appears to
have been accepted as part of the API since 2.3.2 and formally accepted
into LSB3.0. Win32 claims to have supported this since '98.

But even though the MacOS X manpage says "BSD Library Functions" at the
top of the page, neither FreeBSD nor OpenBSD appears to have it
at all. Not really a lot of chance that we could pull portions of the
Darwin libc into PostgreSQL, huh?

Maybe the easiest thing would be to download the libc locale support of
one of the BSDs, remove the global variable and use that...

[1] http://www.hmug.org/man/3/newlocale.php

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 16:44:00
Message-ID: 15183.1125679440@sss.pgh.pa.us
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> [1] http://www.hmug.org/man/3/newlocale.php

Hmm, the more general page seems to be

http://www.hmug.org/man/3/xlocale.php

This seems to be pretty much exactly what we want, at least API-wise.
Now, if we can find an implementation of this with a BSD license ;-) ...

[ I don't recall at the moment whether Apple publishes all of Darwin
under a straight BSD license, but that would surely be a good place to
look first. ]

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 17:04:42
Message-ID: 20050902170437.GC15466@svana.org
Lists: pgsql-hackers

On Fri, Sep 02, 2005 at 12:44:00PM -0400, Tom Lane wrote:
>
> Hmm, the more general page seems to be
>
> http://www.hmug.org/man/3/xlocale.php
>
> This seems to be pretty much exactly what we want, at least API-wise.
> Now, if we can find an implementation of this with a BSD license ;-) ...

Yes it is, it's exactly the same interface as glibc. Windows has them
all with an underscore prefix.

> [ I don't recall at the moment whether Apple publishes all of Darwin
> under a straight BSD license, but that would surely be a good place to
> look first. ]

libc is listed as APSL licence, whatever that means. Something with
that many clauses can't be BSD compatible.

What I wonder is how come Apple implemented all this in their version
yet none of the BSDs got around to it.

I've looked around for Citrus, it appears that NetBSD contains the
latest version and while there's a lot of stuff for LC_CTYPE and charset
conversion, LC_COLLATE didn't appear to be high on their priorities.

I especially liked these fragments from the OpenBSD and NetBSD CVS
repositories. Tom, you've convinced me, relying on the platform is
silly. We have platforms that don't support LC_COLLATE in one locale,
let alone multiple. FreeBSD thankfully does support it.

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/string/strcoll.c?rev=HEAD
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strcoll.c?rev=HEAD
--- snip ---
/*
 * Compare strings according to LC_COLLATE category of current locale.
 */
int
strcoll(s1, s2)
	const char *s1, *s2;
{

	_DIAGASSERT(s1 != NULL);
	_DIAGASSERT(s2 != NULL);

	/* LC_COLLATE is unimplemented, hence always "C" */
	return (strcmp(s1, s2));
}

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/


From: AgentM <agentm(at)themactionfaction(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 17:11:21
Message-ID: 9ECD5726-CB80-4B5D-A0A4-1DF1AAC75014@themactionfaction.com
Lists: pgsql-hackers

The sources can be found here:
http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/xlocale.c

The Apple License *is* necessarily compatible with the BSD License.
http://www.gnu.org/philosophy/apsl.html

On Sep 2, 2005, at 11:44 AM, Tom Lane wrote:

> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
>
>> [1] http://www.hmug.org/man/3/newlocale.php
>>
>
> Hmm, the more general page seems to be
>
> http://www.hmug.org/man/3/xlocale.php
>
> This seems to be pretty much exactly what we want, at least API-wise.
> Now, if we can find an implementation of this with a BSD
> license ;-) ...
>
> [ I don't recall at the moment whether Apple publishes all of Darwin
> under a straight BSD license, but that would surely be a good place to
> look first. ]
>
> regards, tom lane

|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-
AgentM
agentm(at)themactionfaction(dot)com
|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: AgentM <agentm(at)themactionfaction(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 20:49:03
Message-ID: 200509022049.j82Kn3I11137@candle.pha.pa.us
Lists: pgsql-hackers

AgentM wrote:
> The sources can be found here:
> http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/xlocale.c
>
> The Apple License *is* necessarily compatible with the BSD License.
> http://www.gnu.org/philosophy/apsl.html

Does compatible mean our combined work is still BSD licensed?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: AgentM <agentm(at)themactionfaction(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-02 21:40:46
Message-ID: 20050902214046.GD30425@surnet.cl
Lists: pgsql-hackers

On Fri, Sep 02, 2005 at 04:49:03PM -0400, Bruce Momjian wrote:
> AgentM wrote:
> > The sources can be found here:
> > http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/xlocale.c
> >
> > The Apple License *is* necessarily compatible with the BSD License.
> > http://www.gnu.org/philosophy/apsl.html
>
> Does compatible mean our combined work is still BSD licensed?

No, because of clause 2.2 (c) of the APSL, at least. (Must distribute
source code if modified.)

--
Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com
Dios hizo a Adán, pero fue Eva quien lo hizo hombre.


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Locale implementation questions (was: Proof of concept COLLATE support with patch)
Date: 2005-09-03 20:34:40
Message-ID: 20050903203434.GA4281@svana.org
Lists: pgsql-hackers

On Fri, Sep 02, 2005 at 11:42:21AM -0400, Tom Lane wrote:
> The objection is fundamentally that a platform-specific implementation
> cannot be our long-term goal, and so expending effort on creating one
> seems like a diversion. If there were a plan put forward showing how
> this is just a useful way-station, and we could see how we'd later get
> rid of the glibc dependency without throwing away the work already done,
> then it would be a different story.

Well, my patch showed that useful locale work can be achieved with
precisely two functions: newlocale and strxfrm_l.

I'm going to talk about two things: one, the code from Apple. Two, how
we present locale support to users.
---
Now, it would be really nice to take Apple's implementation in Darwin
and use that. What I don't understand is the licence of the code in
Darwin. My interpretation is that stuff in:

http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/

is Apple stuff under APSL, useless to us. And that stuff in:

http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/FreeBSD/

are just patches to FreeBSD and thus under the normal BSD license (no
big header claiming the licence change). The good news is that the
majority of what we need is in patch form. The bad news is that the hub
of the good stuff (newlocale, duplocale, freelocale) is under a big fat
APSL licence.

Does anyone know if this code can be used at all by BSD projects or did
they blanket relicence everything?
---
Now, I want to bring up some points relating to including a locale
library in PostgreSQL. Given that none of the BSDs seem really
interested in fixing the issue we'll have to do it ourselves (I don't
see anyone else doing it). We can save ourselves effort by basing it on
FreeBSD's locale code, because then we can use their datafiles, which we
*definitely* don't want to maintain ourselves. Now:

1. FreeBSD's locale list is short, some 48 compared with glibc's 217.
Hopefully Apple can expand on that in a way we can use. But given the
difference we should probably give people a way of falling back to the
system libraries in case there's a locale we don't support.

On the other hand, lots of locales are similar so maybe people can find
ones close enough to work. No, glibc and FreeBSD use different file
formats, so you can't copy them.

Do we want this locale data just for collation, or do we want to be
able to use it for formatting monetary amounts too? This is even more
info to store. Lots of languages use ISO/IEC 14651 for order.

2. Locale data needs to be combined with a charset and compiled to work
with the library. PostgreSQL supports at least 15 charsets but we don't
want to ship compiled versions of all of these (Debian learnt that the
hard way). So, how do we generate the files people need?

a. Auto-compile on demand. First time a locale is referenced spawn
the compiler to create the locale, then continue. (Ugh)
b. Add a CREATE LOCALE english AS 'en_US' WITH CHARSET 'utf8'. Then
require the COLLATE clause to refer to this identifier. This has some
appeal, separating the system names from the PostgreSQL names. It also
gives some info regarding charsets.
c. Should users be allowed to define new locales?
d. Should admins be required to create the external files using a
program, say pg_createlocale?

Remember, if you use a latin1 locale to sort utf8 you'll get the wrong
result, so we want to avoid that.

3. Compiled locale files are large. One UTF-8 locale datafile can
exceed a megabyte. Do we want the option of disabling it for small
systems?

4. Do we want the option of running system locale in parallel with the
internal ones?

5. I think we're going to have to deal with the very real possibility
that our locale database will not be as good as some of the system
provided ones. The question is how. This is quite unlike timezones
which are quite standardized and rarely change. That database is quite
well maintained.

Would people object to a configure option that selected:
--with-locales=internal (use pg database)
--with-locales=system (use system database for win32, glibc or MacOS X)
--with-locales=none (what we support now, which is neither)

I don't think it will be much of an issue to support this, all the
functions take the same parameters and have almost the same names.

6. Locales for SQL_ASCII. Seems to me you have two options, either
reject COLLATE altogether unless they specify a charset, or don't care
and let the user shoot themselves in the foot if they wish...
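
The charset mismatch mentioned under point 2 (a latin1 locale used to sort utf8 data) is what the "method finding encoding for locale" function would guard against. A rough sketch of such a check, assuming glibc's nl_langinfo_l is available; locale_matches_encoding is a hypothetical name, and a real implementation would also need to map between PostgreSQL's encoding names and the codeset names the libc reports, which this sketch does not attempt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <string.h>
#include <strings.h>

/*
 * Returns 1 if the named locale's codeset equals db_encoding (compared
 * case-insensitively), 0 if it differs, and -1 if the locale cannot be
 * loaded.  No alias handling: "UTF-8" and "utf8" count as different.
 */
int
locale_matches_encoding(const char *locale_name, const char *db_encoding)
{
    locale_t    loc;
    const char *codeset;
    int         match;

    assert(locale_name != NULL && db_encoding != NULL);
    loc = newlocale(LC_CTYPE_MASK, locale_name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;
    codeset = nl_langinfo_l(CODESET, loc);   /* valid until freelocale */
    match = (strcasecmp(codeset, db_encoding) == 0);
    freelocale(loc);
    return match;
}
```

A COLLATE clause could then be rejected, or at least warned about, whenever the check returns 0 for the database encoding.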

BTW, this MacOS locale support seems to be new for 10.4.2 according to
the CVS log info, can anyone confirm this?

Anyway, I hope this post didn't bore too much. Locale support has been
one of those things that has bugged me for a long time and it would be
nice if there could be some real movement.

Have a nice weekend,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-03 21:36:02
Message-ID: 87u0h1ygu5.fsf@stark.xeocode.com
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > I still don't get where the hostility towards this functionality comes from.
>
> We're not really willing to say "here is a piece of syntax REQUIRED
> BY THE SQL SPEC which we only support on some platforms". readline,
> O_DIRECT, and the like are a completely inappropriate analogy, because
> those are inherently platform-dependent (and not in the spec).

But that's not the case at all. The syntax can be supported everywhere it
would just be somewhat faster on some platforms than others. It's already
reasonably fast on any platform that caches locale information which includes
glibc and presumably other free software libcs. It would be slightly faster if
there are _l functions. And much slower if the libc setlocale implementation
is braindead. But there's nothing wrong with saying "it's slow because your
libc is slow. Compile with this freely available library which has a better
implementation". The programming syntax would still be exactly 100% the same.

> The objection is fundamentally that a platform-specific implementation
> cannot be our long-term goal, and so expending effort on creating one
> seems like a diversion. If there were a plan put forward showing how
> this is just a useful way-station, and we could see how we'd later get
> rid of the glibc dependency without throwing away the work already done,
> then it would be a different story.

It's not like the actual calls to setlocale are going to be much code. One day
presumably some variant of these _l functions will become entirely standard.
In which case you're talking about potentially "throwing away" 50 lines of
code. The bulk of the code is going to be parsing and implementing the actual
syntax and behaviour of the SQL spec. And in any case I wouldn't expect it to
ever get thrown away. There will be people compiling on RH9 or similar vintage
systems for a long time.

--
greg


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions (was: Proof of concept COLLATE support with patch)
Date: 2005-09-03 21:44:50
Message-ID: 87oe79ygfh.fsf@stark.xeocode.com
Lists: pgsql-hackers


Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:

> 2. Locale data needs to be combined with a charset and compiled to work
> with the library. PostgreSQL supports at least 15 charsets but we don't
> want to ship compiled versions of all of these (Debian learnt that the
> hard way). So, how do we generate the files people need?

That's just one of many lessons learned the hard way by distributions. Nor
will it be the last innovation in this area.

I really find this instinct of wanting to reimplement large swaths of the OS
inside Postgres (and annoying detail-ridden swaths that are hard to get right
and continually evolving too) to be a bad idea.

I can't believe it's harder to maintain an

#ifdef HAVE_STRCOLL_L
#else
#endif

than it is to try to maintain an entire independent locale library. Nor is it
simpler for sysadmins to have to maintain an entirely separate set of locales
independently from the system locales.

If you really are unhappy enough with OS setlocale implementations to want to
try to do this then it would be more helpful to do it outside of Postgres.
Package up the Apple setlocale library as a separate package that anyone can
install on Solaris, BSD, Linux or whatever. Then Postgres can just say "it
works fine with your OS library but your OS library might be very slow. Here's
a third-party library that you can install that is fast and may relieve any
problems you have with collation performance."

But I think that's getting ahead of things. Until Postgres even supports
collations using the OS libraries you won't even know if that's even
necessary.

--
greg


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions (was: Proof of concept COLLATE support with patch)
Date: 2005-09-04 12:31:05
Message-ID: 20050904123103.GA21198@svana.org
Lists: pgsql-hackers

On Sat, Sep 03, 2005 at 05:44:50PM -0400, Greg Stark wrote:
> [...] Nor is it
> simpler for sysadmins to have to maintain an entirely separate set of locales
> independently from the system locales.

Indeed, I was already coming up with mechanisms to determine what
locales the system uses and try to autogenerate them. I agree though,
it's not useful for systems that already have complete locale support.
Why add to the burden?

Anyway, my reading of the specs says that we must support the syntax.
It doesn't say we need to support any orderings other than the default
(ie what we do now).

> If you really are unhappy enough with OS setlocale implementations to want to
> try to do this then it would be more helpful to do it outside of Postgres.
> Package up the Apple setlocale library as a separate package that anyone can
> install on Solaris, BSD, Linux or whatever. Then Postgres can just say "it
> works fine with your OS library but your OS library might be very slow. Here's
> a third-party library that you can install that is fast and may relieve any
> problems you have with collation performance."

That's why I asked about the patches and files that Apple wrote. What
are the licence restrictions? Would we be able to download the, what,
20 files and distribute it as a library. Being APSL we couldn't include
it in the tarball, but it could be a pgfoundry project or something.

If somebody knows a reason why this could not be done, speak up now because
my reading of the APSL licence tells me it's fine.

> But I think that's getting ahead of things. Until Postgres even supports
> collations using the OS libraries you won't even know if that's even
> necessary.

Well, I added COLLATE support for ORDER BY and CREATE INDEX and it
worked in under 200 lines. I'm thinking ahead and I don't think the
COLLATE rules are that hard. Implementing them seems a bit fiddly. It
may be easiest to consider COLLATE a non-associative operator.

I'm still unsure if I should turn the string comparison operators into
three-argument functions.
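
For illustration only, a three-argument comparison might look like the sketch below. find_locale, text_cmp_l and text_lt_l are hypothetical names, not functions from the patch. The small cache mimics the idea of resolving a locale name to an opaque pointer once, and falling back to plain strcmp for an unknown locale is a simplification chosen for this sketch, not a proposal for backend behaviour.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

#define MAX_CACHED_LOCALES 8

struct cached_locale
{
    char     name[64];
    locale_t loc;
};

static struct cached_locale cache[MAX_CACHED_LOCALES];
static int ncached = 0;

/* Look the locale up by name, creating and caching it on first use. */
locale_t
find_locale(const char *name)
{
    locale_t loc;
    int      i;

    assert(name != NULL);
    for (i = 0; i < ncached; i++)
        if (strcmp(cache[i].name, name) == 0)
            return cache[i].loc;
    if (ncached == MAX_CACHED_LOCALES || strlen(name) >= sizeof cache[0].name)
        return (locale_t) 0;
    loc = newlocale(LC_ALL_MASK, name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return (locale_t) 0;
    strcpy(cache[ncached].name, name);
    cache[ncached].loc = loc;
    ncached++;
    return loc;
}

/* Three-argument comparison: returns -1, 0 or 1. */
int
text_cmp_l(const char *a, const char *b, const char *locale_name)
{
    locale_t loc = find_locale(locale_name);
    int      r;

    /* Sketch-only fallback: unknown locale means plain byte order. */
    r = (loc == (locale_t) 0) ? strcmp(a, b) : strcoll_l(a, b, loc);
    return (r > 0) - (r < 0);
}

/* The "<" operator expressed as a three-argument function. */
bool
text_lt_l(const char *a, const char *b, const char *locale_name)
{
    return text_cmp_l(a, b, locale_name) < 0;
}
```

The other comparison operators would be thin wrappers over text_cmp_l in the same way as text_lt_l.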

Anyway, I'll look into the library issue first.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: kleptog(at)svana(dot)org
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, gsstark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions
Date: 2005-09-04 13:25:36
Message-ID: 20050904.222536.39155679.ishii@sraoss.co.jp
Lists: pgsql-hackers

> 3. Compiled locale files are large. One UTF-8 locale datafile can
> exceed a megabyte. Do we want the option of disabling it for small
> systems?

To avoid the problem, you could dynamically load the compiled
tables. The charset conversion tables are handled in a similar way.

Also I think it's important to allow user defined collate data. To
implement the CREATE COLLATE syntax, we need to have that capability
anyway.

> 4. Do we want the option of running system locale in parallel with the
> internal ones?
>
> 5. I think we're going to have to deal with the very real possibility
> that our locale database will not be as good as some of the system
> provided ones. The question is how. This is quite unlike timezones
> which are quite standardized and rarely change. That database is quite
> well maintained.
>
> Would people object to a configure option that selected:
> --with-locales=internal (use pg database)
> --with-locales=system (use system database for win32, glibc or MacOS X)
> --with-locales=none (what we support now, which is neither)
>
> I don't think it will be much of an issue to support this, all the
> functions take the same parameters and have almost the same names.

To be honest, I don't understand why we have to rely on (often broken)
system locales. I don't think building our own locale data is too
hard, and once we build it, the maintenance cost will be very small
since it should not be changed regularly. Moreover we could enjoy the
benefit that PostgreSQL handles collations in a correct manner on any
platform which PostgreSQL supports.

> 6. Locales for SQL_ASCII. Seems to me you have two options, either
> reject COLLATE altogether unless they specify a charset, or don't care
> and let the user shoot themselves in the foot if they wish...
>
> BTW, this MacOS locale support seems to be new for 10.4.2 according to
> the CVS log info, can anyone confirm this?
>
> Anyway, I hope this post didn't bore too much. Locale support has been
> one of those things that has bugged me for a long time and it would be
> nice if there could be some real movement.

Right. We Japanese (and probably Chinese too) have been bugged by the
broken multibyte locales for a long time. Using the C locale helps us to a
certain extent, but for Unicode we need correct locale data, otherwise
the sorted data will be complete chaos.
--
SRA OSS, Inc. Japan
Tatsuo Ishii


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, gsstark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions
Date: 2005-09-04 15:01:13
Message-ID: 20050904150055.GB21198@svana.org
Lists: pgsql-hackers

On Sun, Sep 04, 2005 at 10:25:36PM +0900, Tatsuo Ishii wrote:
> > 3. Compiled locale files are large. One UTF-8 locale datafile can
> > exceed a megabyte. Do we want the option of disabling it for small
> > systems?
>
> To avoid the problem, you could dynamically load the compiled
> tables. The charset conversion tables are handled in a similar way.

That's not the point; of course they are loaded dynamically. The
question is when we create the files in the first place. There are
48*15 = 720 combinations, which would amount to tens of megabytes of
essentially useless data. *When* you create the files is an important
question. Compile time is out.

Charset conversion is completely different; there just aren't that many
combinations.

> Also I think it's important to allow user defined collate data. To
> implement the CREATE COLLATE syntax, we need to have that capability
> anyway.

Most OSes allow you to create collate data yourself anyway, so why do we
need to implement this too?

> To be honest, I don't understand why we have to rely on (often broken)
> system locales. I don't think building our own locale data is too
> hard, and once we have built it, the maintenance cost will be very small
> since it should not change regularly. Moreover, we would enjoy the
> benefit that PostgreSQL handles collations in a correct manner on any
> platform which PostgreSQL supports.

You say building our own locale data is not hard. I disagree, it's a
waste of time we can do without. Unless you know the language yourself
you cannot check changes made by anybody else. If there's an error in
locale ordering, take it up with your OS distributor.

I also think we open ourselves to questions like:

1. My locale is supported by the system but not by PostgreSQL, why?
2. My locale was supported last release but not this one, why?
3. Why does PostgreSQL sort differently from 'sort' or any other app on
my system?

> Right. We Japanese (and probably Chinese too) have been bugged by the
> broken multibyte locales for a long time. Using the C locale helps us to a
> certain extent, but for Unicode we need correct locale data, otherwise
> the sorted data will be complete chaos.

OK, is glibc still wrong, or are they just implementing the Unicode
standard and that's what's wrong?

All I'm saying is that we need to allow use of system locales until our
native locale support is mature. In the end something like ICU
(http://icu.sourceforge.net/) will end up obsoleting us. Nobody (in
free-software anyway) uses it yet, but eventually it may be viable to
require it in order to allow system-independent locales.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-04 17:06:57
Message-ID: 200509041906.57721.peter_e@gmx.net
Lists: pgsql-hackers

Martijn van Oosterhout wrote:
> This is just a proof of concept patch. I didn't send it to -patches
> because as Tom pointed out, there's no hope of it getting in due to
> platform-dependent behaviour.

I think it would be best if we defined an internal API for plugging in
various kinds of locale support. Then you can hook in this
"newlocale", the Windows variant, ICU, or plain-old POSIX locale
support for backward compatibility. You already identified most of the
API functions.
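
To make the shape of such an API a bit more concrete, here is a minimal
sketch in C. Everything in it is an assumption for illustration: the
LocaleProvider struct, its member names and find_locale_provider() are
hypothetical and do not come from the patch or from the backend. Only a
trivial "c" provider is filled in; a newlocale, Windows or ICU provider
would fill in the same four callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider interface; none of these names exist in the patch. */
typedef struct LocaleProvider
{
    const char *name;
    /* Return an opaque handle for the named locale, or NULL if unknown. */
    void     *(*open_locale) (const char *locale_name);
    /* Compare two NUL-terminated strings; result is <0, 0 or >0. */
    int       (*collate) (void *handle, const char *a, const char *b);
    /* Write a sort key to dst (dstlen bytes available); return key length. */
    size_t    (*transform) (void *handle, char *dst, const char *src,
                            size_t dstlen);
    void      (*close_locale) (void *handle);
} LocaleProvider;

/* Trivial provider: plain byte order, i.e. the "C"/"POSIX" locale. */
static int c_handle_token;

static void *
c_open(const char *locale_name)
{
    if (strcmp(locale_name, "C") == 0 || strcmp(locale_name, "POSIX") == 0)
        return &c_handle_token;
    return NULL;                /* this provider knows no other locales */
}

static int
c_collate(void *handle, const char *a, const char *b)
{
    (void) handle;
    return strcmp(a, b);
}

static size_t
c_transform(void *handle, char *dst, const char *src, size_t dstlen)
{
    size_t      len = strlen(src);

    (void) handle;
    if (dstlen > 0)
    {
        /* In byte order the sort key is the string itself; truncate to fit. */
        size_t      n = (len < dstlen - 1) ? len : dstlen - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;                 /* like strxfrm: length of the full key */
}

static void
c_close(void *handle)
{
    (void) handle;              /* nothing to release */
}

static const LocaleProvider c_provider = {
    "c", c_open, c_collate, c_transform, c_close
};

/* Providers compiled into this build; others would be appended here. */
static const LocaleProvider *const providers[] = {&c_provider};

/* Look up a provider by name; returns NULL if it was not compiled in. */
const LocaleProvider *
find_locale_provider(const char *name)
{
    size_t      i;

    assert(name != NULL);
    for (i = 0; i < sizeof(providers) / sizeof(providers[0]); i++)
    {
        if (strcmp(providers[i]->name, name) == 0)
            return providers[i];
    }
    return NULL;
}
```

A caller would look up a provider once, open a locale through it, and then
use only the callbacks, so the rest of the code never needs to know which
library sits behind the handle.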

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: kleptog(at)svana(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, gsstark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions
Date: 2005-09-04 19:02:45
Message-ID: 87fysky7u2.fsf@stark.xeocode.com
Lists: pgsql-hackers


Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:

> To be honest, I don't understand why we have to rely on (often broken)
> system locales. I don't think building our own locale data is too
> hard, and once we have built it, the maintenance cost will be very small
> since it should not change regularly. Moreover, we would enjoy the
> benefit that PostgreSQL handles collations in a correct manner on any
> platform which PostgreSQL supports.

I think it's sheer madness to try to reproduce large swaths of the OS inside
Postgres because you're unhappy with the quality of the OS implementation. You
should be asking yourself why OS vendors have such a hard time getting this
stuff right and why would Postgres do any better. Wouldn't that work be better
spent improving the database functionality of Postgres?

Or at least better spent improving the locale support for the entire OS? It
would be positively awful if every application on my system had its own locale
database each of which had its own set of bugs and its own feature set.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-04 22:06:11
Message-ID: 8946.1125871571@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> But there's nothing wrong with saying "it's slow because your
> libc is slow. Compile with this freely available library which has a better
> implementation".

The hole in that argument is the assumption that there *is* a freely
available library that can be used (where freely == BSD license).
We wouldn't be having this discussion if we knew of one.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-04 23:15:21
Message-ID: 9398.1125875721@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> I think it would be best if we defined an internal API for plugging in
> various kinds of locale support.

Agreed ...

> Then you can hook in this
> "newlocale", the Windows variant, ICU, or plain-old POSIX locale
> support for backward compatibility.

If plain old POSIX actually did what we needed, we likely wouldn't be
having this discussion at all. POSIX doesn't give us enough visibility
of the locale's properties (in particular, which character set encoding
it wants). The performance penalties it imposes are pretty bad also,
though arguably secondary.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, kleptog(at)svana(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions
Date: 2005-09-04 23:19:38
Message-ID: 9440.1125875978@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> I think it's sheer madness to try to reproduce large swaths of the OS
> inside Postgres because you're unhappy with the quality of the OS
> implementation. You should be asking yourself why OS vendors have such
> a hard time getting this stuff right

In the case of the *BSDs, it's pretty obviously because they don't care.

> and why would Postgres do any better

In the first place, we do care, and in the second place, having to deal
with only one set of locale bugs would in itself be a huge advance over
where we are now.

We went over to maintaining our own timezone code for more or less the
same reasons, and in hindsight that was obviously the right decision.
Locale support is a bigger chunk, no doubt about it, but we also have
a lot of motivation.

regards, tom lane


From: Petr Jelinek <pjmodos(at)seznam(dot)cz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-04 23:52:45
Message-ID: 431B88CD.8050904@seznam.cz
Lists: pgsql-hackers

Tom Lane wrote:
>
> The hole in that argument is the assumption that there *is* a freely
> available library that can be used (where freely == BSD license).
> We wouldn't be having this discussion if we knew of one.

I see this discussion as another reason to use ICU; I mean a complete
rewrite of locale handling to use ICU on all platforms. I know it's a big
project, but it's doable for 8.2, and it would solve virtually all locale
problems and could be the base for new Unicode/locale features. I am not
sure if this is the way Postgres wants to go, though (having a dependency on
such a big and uncommon library).

--
Regards
Petr Jelinek (PJMODOS)


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-05 07:51:52
Message-ID: 20050905075152.GA5278@svana.org
Lists: pgsql-hackers

On Sun, Sep 04, 2005 at 06:06:11PM -0400, Tom Lane wrote:
> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > But there's nothing wrong with saying "it's slow because your
> > libc is slow. Compile with this freely available library which has a better
> > implementation".
>
> The hole in that argument is the assumption that there *is* a freely
> available library that can be used (where freely == BSD license).
> We wouldn't be having this discussion if we knew of one.

1. Use the one in Darwin just for the *BSDs and Solaris at least. It's
not great but it would probably work.

2. Long term, transition to ICU (http://icu.sourceforge.net/) which is
the cross-platform internationalisation library used by Java. Looks
like Mono and Gnome/GTK are going to use this (or at least allow use
of) soon also. It uses the X licence AFAICS. It's a big pill right now
but in a year it could be installed as standard on most Linux systems.
It's at least available everywhere now.

Note, it's not compatible with POSIX locales, so if we go there it'll be
an all or nothing switch. But if we intend to go there eventually, it
makes fiddling on our own library a waste of time.

Incidentally, I played with the code in Darwin and getting it to compile
on a system that already has extended locale support is, uh, tricky to
say the least. Lots of conflicting definitions.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Patrick Welche <prlw1(at)newn(dot)cam(dot)ac(dot)uk>
To: Petr Jelinek <pjmodos(at)seznam(dot)cz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept COLLATE support with patch
Date: 2005-09-05 18:42:16
Message-ID: 20050905184216.GT8469@quartz.itdept.newn.cam.ac.uk
Lists: pgsql-hackers

On Mon, Sep 05, 2005 at 01:52:45AM +0200, Petr Jelinek wrote:
> Tom Lane wrote:
> >
> >The hole in that argument is the assumption that there *is* a freely
> >available library that can be used (where freely == BSD license).
> >We wouldn't be having this discussion if we knew of one.
>
> I see this discussion as another reason to use ICU; I mean a complete
> rewrite of locale handling to use ICU on all platforms. I know it's a big
> project, but it's doable for 8.2, and it would solve virtually all locale
> problems and could be the base for new Unicode/locale features. I am not
> sure if this is the way Postgres wants to go, though (having a dependency on
> such a big and uncommon library).

Maybe not so uncommon...

% ldd /usr/local/bin/php
/usr/local/bin/php:
...
-lresolv.1 => /usr/lib/libresolv.so.1
-lpq.4 => /usr/local/pgsql/lib/libpq.so.4
-lintl.0 => /usr/lib/libintl.so.0
-licudata.34 => /usr/local/lib/libicudata.so.34
-licuuc.34 => /usr/local/lib/libicuuc.so.34
-licui18n.34 => /usr/local/lib/libicui18n.so.34
-licuio.34 => /usr/local/lib/libicuio.so.34
...

Cheers,

Patrick


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proof of concept COLLATE support with patch
Date: 2006-06-14 18:49:12
Message-ID: 200606141849.k5EInCC17553@candle.pha.pa.us
Lists: pgsql-hackers


Thread added to TODO.detail.

---------------------------------------------------------------------------

Greg Stark wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > Greg Stark <gsstark(at)mit(dot)edu> writes:
> > > I still don't get where the hostility towards this functionality comes from.
> >
> > We're not really willing to say "here is a piece of syntax REQUIRED
> > BY THE SQL SPEC which we only support on some platforms". readline,
> > O_DIRECT, and the like are a completely inappropriate analogy, because
> > those are inherently platform-dependent (and not in the spec).
>
> But that's not the case at all. The syntax can be supported everywhere it
> would just be somewhat faster on some platforms than others. It's already
> reasonably fast on any platform that caches locale information which includes
> glibc and presumably other free software libcs. It would be slightly faster if
> there are _l functions. And much slower if the libc setlocale implementation
> is braindead. But there's nothing wrong with saying "it's slow because your
> libc is slow. Compile with this freely available library which has a better
> implementation". The programming syntax would still be exactly 100% the same.
>
> > The objection is fundamentally that a platform-specific implementation
> > cannot be our long-term goal, and so expending effort on creating one
> > seems like a diversion. If there were a plan put forward showing how
> > this is just a useful way-station, and we could see how we'd later get
> > rid of the glibc dependency without throwing away the work already done,
> > then it would be a different story.
>
> It's not like the actual calls to setlocale are going to be much code. One day
> presumably some variant of these _l functions will become entirely standard.
> In which case you're talking about potentially "throwing away" 50 lines of
> code. The bulk of the code is going to be parsing and implementing the actual
> syntax and behaviour of the SQL spec. And in any case I wouldn't expect it to
> ever get thrown away. There will be people compiling on RH9 or similar vintage
> systems for a long time.
>
> --
> greg

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: kleptog(at)svana(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, gsstark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions
Date: 2006-06-14 18:49:25
Message-ID: 200606141849.k5EInPj17579@candle.pha.pa.us
Lists: pgsql-hackers


Thread added to TODO.detail.

---------------------------------------------------------------------------

Tatsuo Ishii wrote:
> > 3. Compiled locale files are large. One UTF-8 locale datafile can
> > exceed a megabyte. Do we want the option of disabling it for small
> > systems?
>
> To avoid the problem, you could dynamically load the compiled
> tables. The charset conversion tables are handled in a similar way.
>
> Also I think it's important to allow user defined collate data. To
> implement the CREATE COLLATE syntax, we need to have that capability
> anyway.
>
> > 4. Do we want the option of running system locale in parallel with the
> > internal ones?
> >
> > 5. I think we're going to have to deal with the very real possibility
> > that our locale database will not be as good as some of the system
> > provided ones. The question is how. This is quite unlike timezones
> > which are quite standardized and rarely change. That database is quite
> > well maintained.
> >
> > Would people object to a configure option that selected:
> > --with-locales=internal (use pg database)
> > --with-locales=system (use system database for win32, glibc or MacOS X)
> > --with-locales=none (what we support now, which is neither)
> >
> > I don't think it will be much of an issue to support this, all the
> > functions take the same parameters and have almost the same names.
>
> To be honest, I don't understand why we have to rely on (often broken)
> system locales. I don't think building our own locale data is too
> hard, and once we have built it, the maintenance cost will be very small
> since it should not change regularly. Moreover, we would enjoy the
> benefit that PostgreSQL handles collations in a correct manner on any
> platform which PostgreSQL supports.
>
> > 6. Locales for SQL_ASCII. Seems to me you have two options, either
> > reject COLLATE altogether unless they specify a charset, or don't care
> > and let the user shoot themselves in the foot if they wish...
> >
> > BTW, this MacOS locale support seems to be new for 10.4.2 according to
> > the CVS log info, can anyone confirm this?
> >
> > Anyway, I hope this post didn't bore too much. Locale support has been
> > one of those things that has bugged me for a long time and it would be
> > nice if there could be some real movement.
>
> Right. We Japanese (and probably Chinese too) have been bugged by the
> broken multibyte locales for a long time. Using the C locale helps us to a
> certain extent, but for Unicode we need correct locale data, otherwise
> the sorted data will be complete chaos.
> --
> SRA OSS, Inc. Japan
> Tatsuo Ishii

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +