Re: WIP patch: Collation support

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Radek Strnad <radek(dot)strnad(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch: Collation support
Date: 2008-09-22 07:22:35
Message-ID: 48D747BB.8020407@enterprisedb.com
Lists: pgsql-hackers

Martijn van Oosterhout wrote:
> On Fri, Sep 19, 2008 at 10:13:43AM +0300, Heikki Linnakangas wrote:
>> It's not like the patch is going to disappear from planet Earth if it
>> doesn't get committed for 8.4. It's still valuable and available when
>> the new catalogs are needed.
>
> I just prefer it as it was because it takes care of a useful subset of
> the features people want in a way that is compatible for the future.
> Whereas the stripped down version, I'm not sure it gets us anywhere.

It gives the capability to have different collations in different
databases within the same cluster. IOW, the same feature as the original
patch. Finer-grained collation would be even better, of course, but
database-level collation is a valuable feature on its own.

The critical question is how much compatibility trouble we're going to
get by having to support the extension to CREATE DATABASE in the
stripped-down patch, when the pg_collation catalog is introduced in a
later version in one form or another. So let's investigate that a bit
further:

In the stripped-down version, the CREATE DATABASE syntax is:

CREATE DATABASE <name> WITH COLLATE=<locale name> CTYPE=<locale name>

In the original patch, the CREATE DATABASE syntax is:

CREATE DATABASE <name> WITH COLLATE=<collation name>

The first thing that we see is that the COLLATE keyword means different
things, so it's probably best to change that into:

CREATE DATABASE <name> WITH LC_COLLATE=<locale name> LC_CTYPE=<locale name>

in the stripped-down version. Then we need a way to map the
stripped-down syntax into the one in the original patch. That's just a
matter of looking up the collation in the pg_collation catalog with the
right LC_COLLATE and LC_CTYPE.

Things get slightly more complicated if there is no such collation in
the pg_collation catalog. One option is to simply create it at that point.
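
A minimal sketch of that lookup, assuming an in-memory stand-in for the
pg_collation catalog. The row layout, the function name resolve_collation
and the rule for naming a newly created collation are all hypothetical;
neither patch defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollationRow:
    """Hypothetical pg_collation row; the real catalog layout is not settled."""
    collname: str
    lc_collate: str
    lc_ctype: str


def resolve_collation(catalog, lc_collate, lc_ctype, create_if_missing=True):
    """Map an LC_COLLATE/LC_CTYPE pair to a collation row.

    Returns the first row whose two locale settings match. If none matches,
    either creates a row on the spot or raises LookupError.
    """
    for row in catalog:
        if row.lc_collate == lc_collate and row.lc_ctype == lc_ctype:
            return row

    if not create_if_missing:
        raise LookupError(
            f"no collation with LC_COLLATE={lc_collate!r}, LC_CTYPE={lc_ctype!r}"
        )

    # Assumed naming rule: reuse the locale name when both settings agree,
    # otherwise join the two names.
    name = lc_collate if lc_collate == lc_ctype else f"{lc_collate}/{lc_ctype}"
    row = CollationRow(name, lc_collate, lc_ctype)
    catalog.append(row)
    return row
```

With create_if_missing=False the same function models the stricter
alternative of rejecting the CREATE DATABASE instead of creating a row.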

BTW, the original patch didn't have any provision for creating rows in
pg_collation reflecting the locales available in the OS, but I think
we'd need that. Otherwise the DBA would need to manually run CREATE
COLLATION for every collation they want users to be able to use.
Assuming we do that, the situation where we can't find a row with the
given LC_COLLATE and LC_CTYPE should not arise in practice.
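
As a rough illustration of such pre-population, the sketch below builds
catalog entries from a list of OS locale names, for instance the output
lines of `locale -a`. The function name seed_catalog, the dictionary
layout and the choice to create one entry per locale with LC_COLLATE equal
to LC_CTYPE are assumptions made for the example, not something either
patch does.

```python
def seed_catalog(locale_names):
    """Build a hypothetical collation catalog from OS locale names.

    Returns a dict mapping (lc_collate, lc_ctype) to a collation name.
    Each locale yields one entry in which both settings are that locale.
    """
    catalog = {}
    for raw in locale_names:
        name = raw.strip()
        if not name:
            continue  # skip blank lines in the input
        # setdefault keeps the first entry if a name is listed twice
        catalog.setdefault((name, name), name)
    return catalog


# Example input, as it might be read from `locale -a`
catalog = seed_catalog(["C", "fi_FI.utf8", "", "sv_SE.utf8", "C"])
```

A lookup such as `("fi_FI.utf8", "fi_FI.utf8") in catalog` then succeeds
for every locale the OS reported, which is the case the paragraph above
relies on. Mixed pairs, where LC_COLLATE and LC_CTYPE differ, are not
covered by this seeding.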

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
