Re: Per-column collation

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Per-column collation
Date: 2010-12-06 12:06:27
Message-ID: AANLkTi=427R7dBeGs8qdd-YO9-LXu0ab+HSFQGQ6y17+@mail.gmail.com
Lists: pgsql-hackers

On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Here is an updated patch to address the issues discussed during this
> commitfest.

Here are comments and questions after I tested the latest patch:

==== Issues ====
* initdb itself appears to succeed, but it prints "could not determine
encoding for locale" messages for every combination of encoding=utf8/eucjp
and locale=ja_JP.utf8/ja_JP.eucjp/C. Is this expected behavior?
----
creating collations ...initdb: locale name has non-ASCII characters,
skipped: bokmål
initdb: locale name has non-ASCII characters, skipped: français
could not determine encoding for locale "hy_AM.armscii8": codeset is "ARMSCII-8"
... (about a dozen more lines) ...
could not determine encoding for locale "vi_VN.tcvn": codeset is "TCVN5712-1"
ok
----

* contrib/citext raises an encoding error when COLLATE is specified,
even if the collation is the same as the database default.
We might need some special treatment for the C locale.
=# SHOW lc_collate; ==> C
=# SELECT ('A'::citext) = ('a'::citext); ==> false
=# SELECT ('A'::citext) = ('a'::citext) COLLATE "C";
ERROR: invalid multibyte character for locale
HINT: The server's LC_CTYPE locale is probably incompatible with the
database encoding.

* pg_dump would generate unportable files for different platforms
because collation names

==== Source code ====
* PG_GETARG_COLLATION might be a better name than PG_GET_COLLATION.

* What is the difference between InvalidOid and DEFAULT_COLLATION_OID
for collation OIDs? The patch replaces DirectFunctionCall with
DirectFunctionCallC in some places, but we could shrink the diff
if we could use InvalidOid instead of DEFAULT_COLLATION_OID.
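
The InvalidOid suggestion above could be sketched as a single fallback
at the point where the collation is consumed. This is only an
illustration under assumptions: resolve_collation is a hypothetical
helper that is not part of the patch, and Oid, InvalidOid and
DEFAULT_COLLATION_OID are redefined locally (with an illustrative
value) so the fragment compiles on its own.

```c
#include <assert.h>

typedef unsigned int Oid;                  /* local stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)
#define DEFAULT_COLLATION_OID ((Oid) 100)  /* illustrative value for this sketch */

/*
 * Hypothetical helper: treat "no collation given" (InvalidOid) as a
 * request for the database default, so callers that never pass a
 * collation would not need to be changed.
 */
static Oid
resolve_collation(Oid collid)
{
    return (collid == InvalidOid) ? DEFAULT_COLLATION_OID : collid;
}
```

With such a fallback, only callers that want a non-default collation
would have to pass one explicitly.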

* I still think explicitly passing collations from function to function
is horrible, because we might forget it in some places and almost all
existing third-party modules won't work. Is it possible to make it a
global variable and push/pop the state when it changes? Sorry if I'm
missing something, but I think we could treat the collation setting
like GUC settings.
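
One way to picture the push/pop idea above is a small stack of
collation OIDs. This is only a sketch under assumptions:
push_collation, pop_collation, current_collation and
COLLATION_STACK_MAX are hypothetical names that exist neither in the
patch nor in PostgreSQL, and Oid/InvalidOid are redefined locally so
the fragment compiles on its own. Error handling is reduced to
assertions.

```c
#include <assert.h>

typedef unsigned int Oid;            /* local stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)
#define COLLATION_STACK_MAX 32       /* hypothetical nesting limit */

static Oid collation_stack[COLLATION_STACK_MAX];
static int collation_depth = 0;

/* Collation currently in effect, or InvalidOid if nothing was pushed. */
Oid
current_collation(void)
{
    if (collation_depth == 0)
        return InvalidOid;
    return collation_stack[collation_depth - 1];
}

/*
 * Make collid the current collation. Returns the previous depth, which
 * the caller hands back to pop_collation() to restore the old state.
 */
int
push_collation(Oid collid)
{
    int save_depth = collation_depth;

    assert(collation_depth < COLLATION_STACK_MAX);
    collation_stack[collation_depth++] = collid;
    return save_depth;
}

/* Restore the state saved by push_collation(), dropping nested pushes too. */
void
pop_collation(int save_depth)
{
    assert(save_depth >= 0 && save_depth <= collation_depth);
    collation_depth = save_depth;
}
```

A caller would wrap the affected code between push_collation() and
pop_collation(), and functions in between would read
current_collation() instead of receiving the OID as an argument.
Restoring to a saved depth rather than popping one entry means an
error path can unwind several nested pushes in one call.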

--
Itagaki Takahiro
