Re: Controlling locale and impact on LIKE statements

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Martin Langhoff <martin(dot)langhoff(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Controlling locale and impact on LIKE statements
Date: 2007-09-06 12:57:43
Message-ID: 20070906125743.GD6186@alvh.no-ip.org
Lists: pgsql-general

Martin Langhoff wrote:
> On 9/5/07, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> > Martin Langhoff wrote:
> >
> > > As I have a Pg install where the locale is already en_US.UTF-8, and
> > > the database already exists, is there a DB-scoped way of controlling
> > > the locale?
> >
> > Not really.
>
> Ah well. But I do have to wonder why... if each database can have its
> own encoding, that is likely to be matched with a locale. Isn't that
> the main usage scenario? In fact, with unicode encodings, it's likely
> that all your DBs are utf-8 encoded, but each may have its own locale.

The problem is twofold:

1. index ordering is dependent on locale, and
2. there are some indexes over text columns on shared tables, that is,
tables that are in all databases (pg_database, pg_authid, etc).

So you cannot really change the locale without making those indexes
invalid. It has been said in the past that it is possible to work
around this, which would allow us to change locale per database, but it
hasn't gotten done yet.
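
To illustrate point 1, here is a minimal Python sketch (not PostgreSQL
code). It sorts the same strings by raw code point, as the C locale does,
and by toy_locale_key, a hypothetical stand-in for a real collation such
as en_US.UTF-8 that simply folds case. The names and sample data are
assumptions made for illustration only.

```python
# Sample values that an index over a text column might hold.
words = ["banana", "Apple", "cherry", "apple", "Banana"]


def toy_locale_key(s):
    # Hypothetical collation: compare case-insensitively first,
    # then fall back to the raw string to break ties.
    return (s.casefold(), s)


# Code point order, which is what the C locale gives you.
c_order = sorted(words)

# Order under the toy "locale" collation.
locale_order = sorted(words, key=toy_locale_key)

print(c_order)
print(locale_order)
```

The two lists differ, so an index built in one order cannot be searched
correctly by code that assumes the other. That is the situation the
shared-table indexes would be in if the locale changed under them.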

> And yet, right now it's all affected by the locale the cluster was
> init'd under. In my case, old Pg installations have been upgraded a
> few times from a Debian Sarge (C locale). Newer DB servers based on
> ubuntu are getting utf-8-ish locales. And all this variation is
> impacting something that should be per DB...
>
> Is this too crazy to ask? ;-)

Well, you are not the only one to have asked this, so it's probably not
crazy. It just hasn't motivated any hacker enough to do it yet.

> > You are right and Eloy is wrong on that discussion. There is not
> > anything the DB can do to use the regular index if the locale is not C
> > for LIKE queries. There are good reasons for this. There's not much
> > option beyond creating the pattern_ops index.
>
> Are the reasons *really* good? ;-)

Well, I can't remember them ATM :-) But this was given deep
consideration and the pattern_ops were the best solution to be found.
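
One commonly cited reason can be sketched in Python (again, not
PostgreSQL code). A prefix match such as LIKE 'ab%' can be served by an
index only if it can be turned into a range scan, 'ab' <= col < 'ac',
and that range is only equivalent to the prefix match when the index is
ordered character by character. The sketch below assumes a toy
collation, toy_locale_key, that ignores hyphens and case; it is a
hypothetical stand-in for a real locale, and range_scan and the sample
data are likewise invented for illustration.

```python
import bisect


def toy_locale_key(s):
    # Hypothetical collation: ignores hyphens and case,
    # ties broken by the raw string.
    return (s.replace("-", "").casefold(), s)


def range_scan(values, prefix, key):
    """Emulate an index range scan for LIKE 'prefix%'.

    The scan returns every value v with prefix <= v < upper, where the
    comparisons use the sort order of the index (given by `key`).
    """
    # Smallest string greater than every string starting with `prefix`
    # in code point order: bump the last character by one.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    ordered = sorted(values, key=key)
    keys = [key(v) for v in ordered]
    lo = bisect.bisect_left(keys, key(prefix))
    hi = bisect.bisect_left(keys, key(upper))
    return ordered[lo:hi]


values = ["a-z", "ab", "abc", "Abd", "ac", "b"]

# What LIKE 'ab%' must return (case-sensitive prefix match).
expected = [v for v in values if v.startswith("ab")]

# Index in code point order: the range scan is exact.
c_result = range_scan(values, "ab", key=lambda s: s)

# Index in the toy locale order: the range scan is not exact.
locale_result = range_scan(values, "ab", key=toy_locale_key)

print(expected, c_result, locale_result)
```

With code point ordering the scan returns exactly the matching rows.
With the toy collation it also returns "Abd", which LIKE 'ab%' would
reject. A pattern_ops index sidesteps this by comparing values strictly
character by character regardless of the locale.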

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees." (E. Dijkstra)
