Re: Per-column collation, proof of concept

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Per-column collation, proof of concept
Date:	2010-07-15 15:24:21
Message-ID:	5575.1279207461@sss.pgh.pa.us
Lists:	pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Well, the comparison function varstr_cmp() contains this comment:

> /*
>  * In some locales strcoll() can claim that nonidentical strings are
>  * equal. Believing that would be bad news for a number of reasons,
>  * so we follow Perl's lead and sort "equal" strings according to
>  * strcmp().
>  */

> This might not be strictly necessary, seeing that citext obviously
> doesn't work that way, but resolving this is really an orthogonal issue.

The problem with not doing that is it breaks hashing: two strings that
compare equal but are not bitwise identical would generally hash to
different values. Hash joins and hash aggregation are the real pain points.

citext works around this in a rather klugy fashion by decreeing that two
strings are equal iff their str_tolower() conversions are bitwise equal.
So it can hash the str_tolower() representation. But that's kinda slow
and it fails in the general case anyhow, I think.

regards, tom lane
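
The two approaches discussed in this message can be sketched in C as follows. This is an illustrative sketch only, not PostgreSQL's code: the function names are hypothetical, FNV-1a stands in for whatever hash function a real system would use, and plain tolower() (single-byte, current C locale) stands in for str_tolower(). The first function mirrors the tie-break described in the varstr_cmp() comment, so that "equal" always means bitwise equal and hashing the raw bytes stays consistent with comparison. The last two mirror the citext idea of defining equality on the case-folded form and hashing that same form.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare using the locale's collation, then break ties with strcmp().
 * With this rule, two strings compare equal only if they are bitwise
 * identical, so a hash over the raw bytes agrees with equality.
 */
int cmp_with_tiebreak(const char *a, const char *b)
{
    int r = strcoll(a, b);

    if (r == 0)
        r = strcmp(a, b);
    return r;
}

/* FNV-1a over the raw bytes (hypothetical stand-in for a real hash). */
uint32_t hash_bytes(const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s != '\0'; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

/*
 * citext-style equality: two strings are equal iff their case-folded
 * forms are bitwise equal. tolower() here is an assumption standing in
 * for a locale-aware, multibyte-capable lowercasing routine.
 */
int equal_folded(const char *a, const char *b)
{
    for (; *a != '\0' && *b != '\0'; a++, b++) {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
    }
    return *a == *b;    /* true only if both strings ended together */
}

/* Hash the case-folded form, so equal_folded() strings hash alike. */
uint32_t hash_folded(const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s != '\0'; s++) {
        h ^= (unsigned char) tolower((unsigned char) *s);
        h *= 16777619u;
    }
    return h;
}
```

The invariant a hash join or hash aggregate relies on is that any two values the equality operator treats as equal produce the same hash. cmp_with_tiebreak() paired with hash_bytes() satisfies it by making equality bytewise; equal_folded() paired with hash_folded() satisfies it by hashing a canonical form, at the cost of folding every input.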

In response to

Re: Per-column collation, proof of concept at 2010-07-15 08:44:26 from Peter Eisentraut

Responses

Re: Per-column collation, proof of concept at 2010-07-15 17:04:19 from Greg Stark
