Re: Cube extension kNN support

From: Stas Kelvich <stas(dot)kelvich(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, obartunov(at)gmail(dot)com
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cube extension kNN support
Date: 2013-12-03 23:10:37
Message-ID: F1F7C601-8848-46DF-9E2D-A9004A7B8033@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.

> cube and earthdistance regression tests fail.

Code updated to work with current HEAD. Also added tests to cover new functionality.

> Do you have any benchmarks ?

This patch just introduces functionality of calculating distances between cubes, so this code don't interfere much with kNN search speed. I think it's better to publish such benchmarks in neighbor patch about split algorithm.
Anyway, we can compare kNN with b-tree and full scan:

create table test(a1 float, a2 float, a3 float);
insert into test (select 100*random(), 100*random(), 100*random() from generate_series(1,1000000) as s(a));
create index on test using gist(cube(array[a1,a2,a3]));
select * from test order by a1 limit 15; -- 227.658 ms
select * from test order by cube(array[a1,a2,a3])->1 limit 15; -- 1.275 ms
create index on test(a1);
select * from test order by a1 limit 15; -- 0.103 ms

As we can see, kNN ordering 10 times slower than B-tree (on silly request for R-Tree, just as example), but still 100+ times faster than full scan on this table.

Stas.

On Sep 25, 2013, at 5:25 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> On 9/22/13 7:38 PM, Stas Kelvich wrote:
>> Here is the patch that introduces kNN search for cubes with euclidean, taxicab and chebyshev distances.
>
> cube and earthdistance regression tests fail.

Attachment Content-Type Size
distances.patch application/octet-stream 45.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-12-03 23:12:27 Re: pgsql: Fix a couple of bugs in MultiXactId freezing
Previous Message Tom Lane 2013-12-03 23:08:11 Re: Why we are going to have to go DirectIO