Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
Date: 2011-06-24 18:24:21
Message-ID: BANLkTimeqvdo6o0vZcc90F6FGRuCZM0p0g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 24, 2011 at 12:51 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 24.06.2011 11:40, Heikki Linnakangas wrote:
>>
>> On 21.06.2011 13:08, Alexander Korotkov wrote:
>>>
>>> I believe it's due to relatively expensive penalty
>>> method in that opclass.
>>
>> Hmm, I wonder if it could be optimized. I did a quick test, creating a
>> gist_trgm_ops index on a list of English words from
>> /usr/share/dict/words. oprofile shows that with the patch, 60% of the
>> CPU time is spent in the makesign() function.
>
> I couldn't resist looking into this, and came up with the attached patch. I
> tested this with:
>
> CREATE TABLE words (word text);
> COPY words FROM '/usr/share/dict/words';
> CREATE INDEX i_words ON words USING gist (word gist_trgm_ops);
>
> And then ran "REINDEX INDEX i_words" a few times with and without the patch.
> Without the patch, reindex takes about 4.7 seconds. With the patch, 3.7
> seconds. That's a worthwhile gain on its own, but becomes even more
> important with Alexander's fast GiST build patch, which calls the penalty
> function more.
>
> I used the attached showsign-debug.patch to verify that the patched makesign
> function produces the same results as the existing code. I haven't tested
> the big-endian code, however.

Out of curiosity (and because there is no comment or Assert here), how
can you be so sure of the input alignment?
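
To illustrate the concern behind this question, here is a minimal sketch
that is not taken from the patch. It assumes the optimization reads the
input several bytes at a time through a wider integer type, which is only
safe if the pointer is suitably aligned. The helper names is_aligned_u32
and load_u32_any are hypothetical, and the uintptr_t-based check relies on
the conventional (implementation-defined) pointer-to-integer mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns nonzero if p sits on a uint32_t-sized
 * boundary. This is the kind of condition an assertion could check
 * before a word-at-a-time read.
 */
static int
is_aligned_u32(const void *p)
{
    return ((uintptr_t) p % sizeof(uint32_t)) == 0;
}

/*
 * Hypothetical helper: loads 32 bits from p regardless of alignment.
 * memcpy() is defined for any address, so this is the portable
 * fallback when alignment cannot be guaranteed.
 */
static uint32_t
load_u32_any(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

A fast path that dereferences a wider pointer directly could guard itself
with assert(is_aligned_u32(p)) and state the guarantee in a comment, while
load_u32_any() covers callers that cannot promise alignment.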

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
