Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)
Date: 2010-10-04 20:19:18
Message-ID: AANLkTinL406w209Nh5MSxjyAVVh3UTGWknpL_VQAgWMk@mail.gmail.com
Lists: pgsql-hackers

I've reworked the patch following your suggestion. In this version I found a
small slowdown compared with the previous version:

SELECT * FROM words WHERE levenshtein_less_equal(a, 'extensize', 2) <= 2;
48,069 ms => 57,875 ms
SELECT * FROM words2 WHERE levenshtein_less_equal(a, 'клубничный', 3) <= 2;
100,073 ms => 113,975 ms
select * from phrases where levenshtein_less_equal('kkkknucklehead
courtliest sapphires be coniferous emolument antarctic Laocoon''s deadens
unseemly', a, 10) <= 10;
22,876 ms => 24,721 ms
select * from phrases2 where levenshtein_less_equal('таяй
раскупорившийся передислоцируется юлианович праздничный лачужка присыхать
опппливший ффехтовальный уууудобряющий', a, 10) <= 10;
55,405 ms => 57,760 ms

I think it is caused by the multiplication performed on each bound
movement. This slowdown is probably negligible, or there may be some way
to get back to the previous performance.
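
For readers unfamiliar with the function being timed, here is a minimal
sketch of the general idea behind a threshold-bounded Levenshtein distance.
It is written in Python with unit costs as an illustration only. It is not
the code from the attached patch, the band computation here is an assumption
chosen for simplicity, and it does not model the per-bound-movement
multiplication discussed above.

```python
def levenshtein_less_equal(s, t, max_d):
    """Return the Levenshtein distance between s and t if it is <= max_d,
    otherwise return max_d + 1.

    Illustrative sketch with unit insert/delete/substitute costs; this is
    an assumption and not the PostgreSQL implementation.
    """
    m, n = len(s), len(t)
    big = max_d + 1  # stands for "more than max_d"

    # The length difference is a lower bound on the distance.
    if abs(m - n) > max_d:
        return big

    # Row 0: distance from the empty prefix of s to each prefix of t.
    prev = [j if j <= max_d else big for j in range(n + 1)]

    for i in range(1, m + 1):
        # Only cells with |i - j| <= max_d can hold a value <= max_d,
        # so only this band of columns is computed.
        lo = max(1, i - max_d)
        hi = min(n, i + max_d)

        cur = [big] * (n + 1)
        if i <= max_d:
            cur[0] = i

        for j in range(lo, hi + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost,  # substitution or match
                         big)                 # clamp at "too far"

        # If every cell in the row exceeds the threshold, stop early.
        if min(cur) >= big:
            return big
        prev = cur

    return prev[n]
```

Python strings are sequences of code points, so multibyte input such as
'клубничный' is compared character by character rather than byte by byte.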

----
With best regards,
Alexander Korotkov.

Attachment: levenshtein_less_equal-0.3.patch (text/x-patch, 15.2 KB)
