Controlling locale and impact on LIKE statements

From: Martin Langhoff <martin(at)catalyst(dot)net(dot)nz>
To: pgsql-general(at)postgresql(dot)org
Subject: Controlling locale and impact on LIKE statements
Date: 2007-09-05 00:48:19
Message-ID: 46DDFCD3.1060408@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi!

I am having a bit of trouble with indexes, locales and LIKE queries.

Background
----------

Using Pg8.1/8.2 on a utf-8 database, my left-anchored LIKE clauses were
forcing a full table scan instead of using the index. After a bit of
digging, I found that Pg can only use the "normal" index for
left-anchored LIKE queries if locale is 'C'.

From http://www.postgresql.org/docs/8.1/static/indexes-types.html :
> The optimizer can also use a B-tree index for queries involving the
> pattern matching operators LIKE and ~ if the pattern is a constant and
> is anchored to the beginning of the string — for example, col LIKE
> 'foo%' or col ~ '^foo', but not col LIKE '%bar'. However, if your
> server does not use the C locale you will need to create the index
> with a special operator class to support indexing of pattern-matching
> queries.

What I think I need to do
-------------------------

As I have a Pg install where the locale is already en_US.UTF-8, and
the database already exists, is there a DB-scoped way of controlling
the locale? I think the index usage noted above is affected by
lc_ctype but I could be wrong.

I really don't want to go down the "rebuild your pgcluster" path as
outlined here
http://archives.postgresql.org/pgsql-hackers/2007-02/msg00992.php
;-)

Is there a better way? In this specific install I can create the
additional index...

However, this needs a general fix for Moodle, which
has an abstract DB schema handling, as we support MySQL, Pg, MSSQL,
Oracle. The whole thing of figuring out what the locale is and
whether to add magical additional indexes just for Pg makes me look
like a loony.

New PostgreSQL installs on modern linuxen like Ubuntu default to non-C
locales, which makes this more of an issue going forward.

See the discussion with Eloy (maintainer of the schema abstraction
layer) at http://moodle.org/mod/forum/discuss.php?d=78738#p350512
login as "guest" to avoid registration.

cheers,

martin
--
-----------------------------------------------------------------------
Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
NZ: +64(4)916-7224 MOB: +64(21)364-017 UK: 0845 868 5733 ext 7224
Make things as simple as possible, but no simpler - Einstein
-----------------------------------------------------------------------

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2007-09-05 02:23:43 Re: Controlling locale and impact on LIKE statements
Previous Message Martin Langhoff 2007-09-05 00:40:59 Controlling locale and impact on LIKE statements