Re: Multilingual application, ORDER BY w/ different locales?

From: Hannu Krosing <hannu(at)sid(dot)tm(dot)ee>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Palle Girgensohn <girgen(at)partitur(dot)se>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multilingual application, ORDER BY w/ different locales?
Date: 2001-11-18 06:56:46
Message-ID: 3BF75BAE.1070905@sid.tm.ee
Lists: pgsql-hackers

Stephan Szabo wrote:

>On Sat, 17 Nov 2001, Tom Lane wrote:
>
>>Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> writes:
>>
>>>Would it be possible to make a function in plpgsql or whatever that
>>>wrapped the collate changes and then order by that and make functional
>>>indexes? Would the system use it?
>>>
>>IIRC, we were debating whether we should consider collation to be an
>>attribute of the datatype (think typmod) or an attribute of individual
>>values (think field added to values of textual types). In the former
>>case, a function like this would only work if we allowed its result to
>>be declared as having the right collate attribute. Which is not
>>impossible, but we don't currently associate any typmod with function
>>arguments or results, and so I'm not sure how painful it would be.
>>With the field-in-data-value approach it's easy to see how it would
>>work. But another byte or word per text value might be a high price
>>to pay ...
>>
>
>True. Although I wonder how things like substring would work in the
>model with typmods if the collation isn't attached in any fashion to
>the return values since I think the substring collation is supposed
>to be the same as the input string's, whereas for something like
>convert it's a different collation based on a parameter. I wonder if
>as a temporary thing, you could use a function that did something
>similar to strxfrm as long as you only used that for sorting purposes.
>
That would mean a new datatype for such a function to return:

CREATE FUNCTION text_with_collation(text, collation) RETURNS text_with_collation

Values of that type would be sorted using the rules of the given collation.
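
To make the idea concrete, here is a rough sketch in Python (a real contrib module would of course be written in C). It pairs a string with a collation name and compares values through a per-collation sort key. The class TextWithCollation, the COLLATIONS registry and the single-level Estonian alphabet table are all assumptions made up for this illustration; none of them is an existing PostgreSQL interface, and real collation rules are more involved than one lookup table.

```python
from functools import total_ordering

# Simplified Estonian alphabet order (assumption: one level, lowercase only).
# 'z' sorts between 's' and 't', and 'õ ä ö ü' follow 'w'.
ET_ALPHABET = "abcdefghijklmnopqrsšzžtuvwõäöüxy"
ET_RANK = {ch: i for i, ch in enumerate(ET_ALPHABET)}


def et_key(s):
    # Characters outside the table sort after it, by code point.
    return [(ET_RANK[c], 0) if c in ET_RANK else (len(ET_ALPHABET), ord(c))
            for c in s.lower()]


def codepoint_key(s):
    # Locale-unaware ordering, comparable to the "C" locale.
    return [ord(c) for c in s]


# Hypothetical registry mapping collation names to key functions.
COLLATIONS = {"C": codepoint_key, "et_EE": et_key}


@total_ordering
class TextWithCollation:
    """A text value that carries the collation it must be sorted by."""

    def __init__(self, text, collation):
        self.text = text
        self.collation = collation
        self.key = COLLATIONS[collation](text)

    def _check(self, other):
        if self.collation != other.collation:
            raise ValueError("cannot compare values with different collations")

    def __eq__(self, other):
        self._check(other)
        return self.key == other.key

    def __lt__(self, other):
        self._check(other)
        return self.key < other.key


def text_with_collation(text, collation):
    return TextWithCollation(text, collation)


words = ["tee", "zoo", "saun"]
by_et = sorted(words, key=lambda w: text_with_collation(w, "et_EE"))
by_c = sorted(words, key=lambda w: text_with_collation(w, "C"))
```

With this table "zoo" lands before "tee" under et_EE but after it under C, which is the kind of difference the new datatype would have to carry into ORDER BY.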

This can currently be added in contrib, but should eventually go into core.

The function itself is quite easy; the collation is the hard part, which
can be done by either:

a) writing our own library

b) using the system locale (I think that locale switching is slow in default
glibc, so the following can be slow too:
ORDER BY text_with_collation(t1,'et_EE'), text_with_collation(t1,'fr_CA')
but I doubt anybody uses it)

c) using a third-party library - at least IBM has one which is almost as
big as the whole of PostgreSQL ;)

Assuming that one backend mostly needs one locale at a time, I think that
b) will be the easiest to implement, but it will clash with the current
locale support if that is compiled in, so you would have to keep rapidly
switching LC_COLLATE between the default and that of the current datum.
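
The switching described here might look roughly like the following Python sketch, which uses the standard locale module as a stand-in for the C-level setlocale()/strxfrm() calls a backend would make. The helper names collate_as and sort_key are hypothetical. Note that setlocale() changes process-wide state, so every key computation pays for two switches, and that a locale such as 'et_EE' only works if it is installed on the machine.

```python
import locale
from contextlib import contextmanager


@contextmanager
def collate_as(name):
    """Temporarily set LC_COLLATE to `name`, then restore the old value."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        yield
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


def sort_key(text, name):
    """Return a strxfrm() sort key for `text` under the locale `name`.

    Switches LC_COLLATE in and out for every single value, which is
    exactly the per-datum cost that makes this approach potentially slow.
    """
    with collate_as(name):
        return locale.strxfrm(text)


# "C" always exists, so it is used here; substitute e.g. "et_EE.UTF-8"
# where that locale is installed.
ordered = sorted(["b", "c", "a"], key=lambda s: sort_key(s, "C"))
```

Since the keys are plain strings once computed, they can be compared later without touching the locale again, which is why a strxfrm-style key is attractive when it is used for sorting only.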

So what we actually need is a system that will _not_ use locale-aware
functions unless specifically told to do so by feeding it
text_with_collation values.

---------------
Hannu

