From: | Greg Stark <stark(at)mit(dot)edu> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: sortsupport for text |
Date: | 2012-03-20 01:20:16 |
Message-ID: | CAM-w4HMg-f9Suk+Fou7gY9aF8s2saw43GNbCcdbB2+PYSE8c6w@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Mar 19, 2012 at 9:23 PM, Martijn van Oosterhout
<kleptog(at)svana(dot)org> wrote:
> Ouch. I was holding out hope that you could get a meaningful
> improvement if we could use the first X bytes of the strxfrm output so
> you only need to do a strcoll on strings that actually nearly match.
> But with an information density of 9 bytes for 1 character it
> doesn't seem worthwhile.
When I was playing with glibc, the strxfrm output size was 4n. I think
what they do is have n bytes for the high-order bits, then n bytes for
the low-order bits, such as capitalization or whitespace differences. I
suspect they used to use 16 bits for each and have gone to some larger size.
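One way to check the expansion factor on a given system is to ask strxfrm for the required length without giving it a buffer, which the C standard allows when the size argument is zero. The following is only a rough sketch: `xfrm_len_in` is a hypothetical helper name, and the locale names you pass to it are assumptions about what is installed on your machine.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of glibc or PostgreSQL): return the number
 * of bytes strxfrm() needs for s under the named LC_COLLATE locale,
 * excluding the terminating '\0'. Returns (size_t) -1 if the locale is not
 * available. Note that this changes the process-wide collation locale.
 */
size_t
xfrm_len_in(const char *locale_name, const char *s)
{
    assert(locale_name != NULL && s != NULL);

    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return (size_t) -1;

    /* With n == 0 the destination may be NULL; only the length is computed. */
    return strxfrm(NULL, s, 0);
}
```

Dividing the result by strlen(s) for a few strings in a non-C locale gives the per-character cost being discussed here; the exact figure depends on the libc version and locale data.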
> That and this gem in the strxfrm manpage:
>
> RETURN VALUE
> The strxfrm() function returns the number of bytes required to
> store the transformed string in dest excluding the terminating
> '\0' character. If the value returned is n or more, the
> contents of dest are indeterminate.
>
> Which means that you have to take the entire transformed string, you
> can't just ask for the first bit. I think that kind of leaves the whole
> idea dead in the water.
I believe the intended API is that you allocate a buffer with your
guess of the right size, call strxfrm, and if it returns a number at
least as large as the buffer, you realloc your buffer and call it again.
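A minimal sketch of that guess-and-retry pattern, assuming plain malloc/realloc rather than any PostgreSQL memory context. The helper name `xfrm_dup` and its `guess` parameter are made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd buffer holding the strxfrm()
 * transform of src under the current LC_COLLATE setting, or NULL on
 * failure. guess is the caller's initial estimate of the buffer size.
 */
char *
xfrm_dup(const char *src, size_t guess)
{
    assert(src != NULL);

    size_t cap = guess > 0 ? guess : 1;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* strxfrm returns the length needed, excluding the terminating '\0'. */
    size_t need = strxfrm(buf, src, cap);

    if (need >= cap)
    {
        /* Guess was too small: buf is indeterminate, so grow and redo. */
        cap = need + 1;
        char *bigger = realloc(buf, cap);

        if (bigger == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = bigger;
        need = strxfrm(buf, src, cap);
        if (need >= cap)
        {
            /* Only expected if the locale changed between the two calls. */
            free(buf);
            return NULL;
        }
    }
    return buf;
}
```

The caller frees the result and compares two transformed strings with strcmp. A good initial guess avoids the second strxfrm call, which matters because each call does the full transformation work.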
--
greg