Re: KNN-GiST with recheck

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Emre Hasegeli <emre(at)hasegeli(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: KNN-GiST with recheck
Date: 2014-09-29 02:16:31
Message-ID: 20140929021631.GG12447@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 26, 2014 at 10:49:42AM +0400, Alexander Korotkov wrote:
> Does this also fix the identical PostGIS problem or is there something
> PostGIS needs to do?
>
>
> This patch provides general infrastructure for recheck in KNN-GiST. PostGIS
> need corresponding change in its GiST opclass. Since PostGIS already define <->
> and <#> operators as distance to bounding box border and bounding box center,
> it can't change their behaviour.
> it has to support new operator "exact distance" in opclass. 

Ah, OK, so they just need something that can be used for the recheck. I
think they currently use ST_Distance() for that. Does it have to be an
operator? If they defined an operator for ST_Distance(), would
ST_Distance() work too for KNN-GiST?

In summary, you still create a normal GiST index on the column:

http://shisaa.jp/postset/postgis-postgresqls-spatial-partner-part-3.html

CREATE INDEX planet_osm_line_ref_index ON planet_osm_line(ref);

which indexes by the bounding box. The new code will allow ordered
index hits to be filtered by something like ST_Distance(), rather than
having to a LIMIT 50 in a CTE, then call ST_Distance(), like this:

EXPLAIN ANALYZE WITH distance AS (
SELECT way AS road, ref AS route
FROM planet_osm_line
WHERE highway = 'secondary'
ORDER BY ST_GeomFromText('POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))', 900913) <#> way
LIMIT 50
)
SELECT ST_Distance(ST_GeomFromText('POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))', 900913), road) AS true_distance, route
FROM distance
ORDER BY true_distance
LIMIT 1;

Notice the CTE uses <#> (bounding box center), and then the outer query
uses ST_Distance and LIMIT 1 to find the closest item.

Excellent!

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2014-09-29 02:28:52 Re: Collation-aware comparisons in GIN opclasses
Previous Message Gavin Flower 2014-09-29 01:37:51 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}