Re: ToDo: KNN Search should to support DISTINCT clasuse?

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: KNN Search should to support DISTINCT clasuse?
Date: 2012-10-22 16:02:00
Message-ID: CAFj8pRDrrUUx6WiZME8CjsROma6LKVkKx50_rZqLq9om=JTVpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/10/22 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> but using DISTINCT breaks KNN searching optimization
>
>> postgres=# explain select distinct nazobce, nazobce <-> 'Benešov' from
>> obce order by nazobce <-> 'Benešov' limit 10
>
> Don't hold your breath. There are two ways the system could implement
> the DISTINCT clause: either sort and uniq, or hashaggregate.
> hashaggregate will destroy any input ordering, so there's no value in
> using the index as input. sort and uniq requires the input to be sorted
> by *all* the columns being distinct'ed, not just one, so again this
> index isn't useful. You could get a plan using the index if you only
> wanted the <-> output column, eg
>
> contrib_regression=# explain select distinct t <-> 'foo' from test_trgm order by t <-> 'foo' limit 10;
> QUERY PLAN
> -------------------------------------------------------------------------------------
> Limit (cost=0.00..0.87 rows=10 width=12)
> -> Unique (cost=0.00..86.75 rows=1000 width=12)
> -> Index Scan using ti on test_trgm (cost=0.00..84.25 rows=1000 width=12)
> Order By: (t <-> 'foo'::text)
> (4 rows)
>
> Perhaps it would be close enough to what you want to use DISTINCT ON:
>
> contrib_regression=# explain select distinct on( t <-> 'foo') *,t <-> 'foo' from test_trgm order by t <-> 'foo' limit 10;
> QUERY PLAN
> -------------------------------------------------------------------------------------
> Limit (cost=0.00..0.87 rows=10 width=12)
> -> Unique (cost=0.00..86.75 rows=1000 width=12)
> -> Index Scan using ti on test_trgm (cost=0.00..84.25 rows=1000 width=12)
> Order By: (t <-> 'foo'::text)
> (4 rows)
>
> regards, tom lane

good tip - it's working

thank you

Regards

Pavel

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-10-22 16:08:32 Re: [PATCH] Support for Array ELEMENT Foreign Keys
Previous Message Tom Lane 2012-10-22 15:57:35 Re: ToDo: KNN Search should to support DISTINCT clasuse?