Regexps vs. locale

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Regexps vs. locale
Date: 2008-12-08 08:11:58
Message-ID: 87ljurozld.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

This came up on IRC:

postgres=# show lc_ctype;
lc_ctype
-------------
fr_FR.UTF-8

postgres=# show server_encoding;
server_encoding
-----------------
UTF8
(1 row)

postgres=# select E'\303\201' ILIKE E'\303\241';
?column?
----------
t
(1 row)

postgres=# select E'\303\201' ~* E'\303\241';
?column?
----------
f
(1 row)

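The two strings are the UTF-8 encodings of U+00C1 and U+00E1 (Á and á),
so ILIKE folds case across them but the regexp operator ~* does not.
Obviously, this happens because the locale support functions in
backend/regex/regc_locale.c are (presumably intentionally) crippled so
as not to support non-ASCII chars, despite all the code there otherwise
using wide chars for everything.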

Why is this? It does not appear to be a documented restriction.
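
To illustrate the difference being described, here is a minimal sketch in
C. It is not the code in regc_locale.c; the function names fold_ascii_only,
fold_wide and same_letter are made up for this example. The first function
folds case only for ASCII letters, which is the kind of restriction
described above; the second defers to the C library's towlower(), whose
result for non-ASCII code points depends on the LC_CTYPE locale in effect.

```c
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical ASCII-only case folding: code points outside A-Z are
 * returned unchanged, so U+00C1 and U+00E1 never compare equal.
 */
wint_t fold_ascii_only(wint_t c)
{
    if (c >= (wint_t) L'A' && c <= (wint_t) L'Z')
        return c + ((wint_t) L'a' - (wint_t) L'A');
    return c;
}

/*
 * Hypothetical locale-aware folding: delegates to towlower(). What it
 * does for non-ASCII input depends on the locale chosen with
 * setlocale(LC_CTYPE, ...), which this sketch does not call.
 */
wint_t fold_wide(wint_t c)
{
    return towlower(c);
}

/* Compare two code points case-insensitively using the given folding. */
int same_letter(wint_t a, wint_t b, wint_t (*fold)(wint_t))
{
    return fold(a) == fold(b);
}
```

With the ASCII-only folding, same_letter(0x00C1, 0x00E1, fold_ascii_only)
is false, which matches the ~* result above. Whether fold_wide gives the
ILIKE-style answer for the same pair depends on the locale the program
runs under.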

--
Andrew (irc:RhodiumToad)
