Re: ToDo: KNN Search should to support DISTINCT clasuse?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: KNN Search should to support DISTINCT clasuse?
Date: 2012-10-22 15:57:35
Message-ID: 2448.1350921455@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> but using DISTINCT breaks KNN searching optimization

> postgres=# explain select distinct nazobce, nazobce <-> 'Beneov' from
> obce order by nazobce <-> 'Beneov' limit 10

Don't hold your breath. There are two ways the system could implement
the DISTINCT clause: either sort and uniq, or hashaggregate.
hashaggregate will destroy any input ordering, so there's no value in
using the index as input. sort and uniq requires the input to be sorted
by *all* the columns being distinct'ed, not just one, so again this
index isn't useful. You could get a plan using the index if you only
wanted the <-> output column, eg

contrib_regression=# explain select distinct t <-> 'foo' from test_trgm order by t <-> 'foo' limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=0.00..0.87 rows=10 width=12)
-> Unique (cost=0.00..86.75 rows=1000 width=12)
-> Index Scan using ti on test_trgm (cost=0.00..84.25 rows=1000 width=12)
Order By: (t <-> 'foo'::text)
(4 rows)

Perhaps it would be close enough to what you want to use DISTINCT ON:

contrib_regression=# explain select distinct on( t <-> 'foo') *,t <-> 'foo' from test_trgm order by t <-> 'foo' limit 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=0.00..0.87 rows=10 width=12)
-> Unique (cost=0.00..86.75 rows=1000 width=12)
-> Index Scan using ti on test_trgm (cost=0.00..84.25 rows=1000 width=12)
Order By: (t <-> 'foo'::text)
(4 rows)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2012-10-22 16:02:00 Re: ToDo: KNN Search should to support DISTINCT clasuse?
Previous Message Phil Sorber 2012-10-22 15:47:13 Re: [WIP] pg_ping utility