Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)
Date: 2010-10-08 01:52:26
Message-ID: AANLkTikZsCsFnzjzhggbV9brfPrn+nYnZo1D=2i3VktH@mail.gmail.com
Lists: pgsql-hackers

2010/10/4 Alexander Korotkov <aekorotkov(at)gmail(dot)com>:
> I've reworked the patch following your suggestion. In this version I found a
> slight slowdown compared with the previous version:
> SELECT * FROM words WHERE levenshtein_less_equal(a, 'extensize', 2) <= 2;
> 48,069 ms => 57,875 ms
> SELECT * FROM words2 WHERE levenshtein_less_equal(a, 'клубничный', 3) <= 2;
> 100,073 ms => 113,975 ms
> select * from phrases where levenshtein_less_equal('kkkknucklehead
> courtliest   sapphires be coniferous emolument antarctic Laocoon''s deadens
> unseemly', a, 10) <= 10;
> 22,876 ms => 24,721 ms
> select * from phrases2 where levenshtein_less_equal('таяй
> раскупорившийся передислоцируется юлианович праздничный лачужка присыхать
> опппливший ффехтовальный уууудобряющий', a, 10) <= 10;
> 55,405 ms => 57,760 ms
> I think it is caused by the multiplication performed on each bound
> movement. This slowdown is probably negligible, or there may be some way
> to recover the previous performance.

This patch doesn't apply cleanly. It also seems to revert some recent
commits to fuzzystrmatch.c. Can you please send a corrected version?

[rhaas pgsql]$ patch -p1 < ~/Downloads/levenshtein_less_equal-0.3.patch
patching file contrib/fuzzystrmatch/fuzzystrmatch.c
Reversed (or previously applied) patch detected! Assume -R? [n]
Apply anyway? [n] y
Hunk #1 FAILED at 5.
Hunk #8 FAILED at 317.
Hunk #9 succeeded at 543 (offset 10 lines).
Hunk #10 succeeded at 567 (offset 10 lines).
Hunk #11 succeeded at 578 (offset 10 lines).
2 out of 11 hunks FAILED -- saving rejects to file
contrib/fuzzystrmatch/fuzzystrmatch.c.rej
patching file contrib/fuzzystrmatch/fuzzystrmatch.sql.in
patching file doc/src/sgml/fuzzystrmatch.sgml

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
