From: | Marcin Mańk <marcin(dot)mank(at)gmail(dot)com> |
---|---|
To: | Gordon Mohr <gojomo-pgsql(at)xavvy(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: high-dimensional knn-GIST tests (was Re: Cube extension kNN support) |
Date: | 2013-10-27 20:43:54 |
Message-ID: | CAK61fk4gh8qRc_0+yig4VnjCPpizUt-dq=dguxUVQ-D=Ztx_Ng@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Oct 24, 2013 at 3:50 AM, Gordon Mohr <gojomo-pgsql(at)xavvy(dot)com> wrote:
> On 9/22/13 4:38 PM, Stas Kelvich wrote:
>
>> Hello, hackers.
>>
>> Here is the patch that introduces kNN search for cubes with
>> euclidean, taxicab and chebyshev distances.
>>
>
> Thanks for this! I decided to give the patch a try at the bleeding edge
> with some high-dimensional vectors, specifically the 1.4 million
> 1000-dimensional Freebase entity vectors from the Google 'word2vec' project:
>
I believe the curse of dimensionality is affecting you here. I think it is
impossible to get an improvement over a sequential scan for 1000-dimensional
vectors. Read here:
http://en.wikipedia.org/wiki/Curse_of_dimensionality#k-nearest_neighbor_classification
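
To make the effect concrete, here is a rough sketch in plain Python. It is not
a measurement of the cube patch or of GiST. It assumes uniformly random points
in the unit hypercube (real word2vec vectors have more structure than that),
and the function name relative_contrast is made up for this illustration. It
computes how much farther the farthest point is than the nearest one, relative
to the nearest distance. When that ratio gets close to zero, all points are
almost equally far from the query, so an index has very little it can prune.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Assumption: points are drawn uniformly from [0, 1]^dim. This is only
    an illustration, not a model of any real data set.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        # Euclidean distance between the query and this point.
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Print the contrast for a few dimensionalities.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

With this setup the ratio should shrink steadily as the number of dimensions
grows, which is the behaviour the Wikipedia section describes.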
Regards
Marcin Mańk