GiST range-contained-by searches versus empty ranges

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: GiST range-contained-by searches versus empty ranges
Date: 2011-11-26 19:43:36
Message-ID: 3251.1322336616@sss.pgh.pa.us
Lists: pgsql-hackers

I started to wonder why the test in range_gist_consistent_int() for
RANGESTRAT_CONTAINED_BY was "return true" (i.e., search the entire index)
rather than range_overlaps, which is what is tested in the comparable
case in rtree_internal_consistent(). The regression tests showed me
how come: an empty-range index entry should be reported as being
contained by anything, but range_overlaps won't necessarily find such
entries. (The rtree case is all right because there is no equivalent of
an empty range in boxes, circles, or polygons.)
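
To make the mismatch concrete, here is a minimal sketch using a toy model of half-open integer ranges. The names ToyRange, toy_overlaps, and toy_contained_by are hypothetical and are not the actual PostgreSQL range code; the sketch only illustrates why an overlap test cannot stand in for "contained by" once empty ranges exist.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified model: a half-open integer range [lo, hi),
 * or the empty range. This is NOT PostgreSQL's RangeType. */
typedef struct {
    bool empty;
    int  lo;   /* inclusive lower bound, ignored when empty */
    int  hi;   /* exclusive upper bound, ignored when empty */
} ToyRange;

ToyRange toy_empty(void)
{
    ToyRange r = { true, 0, 0 };
    return r;
}

ToyRange toy_range(int lo, int hi)
{
    assert(lo < hi);   /* non-empty by construction */
    ToyRange r = { false, lo, hi };
    return r;
}

/* An empty range shares no points with anything, so it never overlaps. */
bool toy_overlaps(ToyRange a, ToyRange b)
{
    if (a.empty || b.empty)
        return false;
    return a.lo < b.hi && b.lo < a.hi;
}

/* a "contained by" b: every point of a lies in b.
 * Vacuously true when a is empty. */
bool toy_contained_by(ToyRange a, ToyRange b)
{
    if (a.empty)
        return true;
    if (b.empty)
        return false;
    return b.lo <= a.lo && a.hi <= b.hi;
}
```

In this model, toy_contained_by(toy_empty(), toy_range(1, 5)) is true while toy_overlaps on the same arguments is false, so a search that prunes on overlap alone would miss the empty entry.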

I am not satisfied with this state of affairs though: range_contained_by
really ought to be efficiently indexable, but with the current coding
an index search is nearly useless. Also, so far as I can see, the
current penalty function allows empty-range entries to be scattered
basically anywhere in the index, making a search for "= empty" pretty
inefficient too.

The first solution that comes to mind is to make the penalty and
picksplit functions forcibly segregate empty ranges from others; that is,
a split will never put empty ranges together with non-empty ones. Then,
we can assume that a non-empty internal node doesn't represent any empty
leaf entries, and avoid descending to it when it doesn't overlap the
target range. Then the equality-search case could be improved too.
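
The segregation idea above can be sketched as follows, again on a toy model. Everything here (ToyKey, toy_penalty, toy_descend_contained_by, toy_descend_equals_empty, and the use of FLT_MAX as a prohibitive penalty) is an assumption made for illustration, not the actual GiST support-function code. The sketch assumes an internal key is either empty (its subtree holds only empty ranges) or a non-empty bounding range (its subtree holds no empty ranges).

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical internal-page key: empty means "subtree holds only empty
 * ranges"; otherwise [lo, hi) bounds a subtree with no empty ranges. */
typedef struct {
    bool empty;
    int  lo;
    int  hi;
} ToyKey;

ToyKey toy_empty_key(void)
{
    ToyKey k = { true, 0, 0 };
    return k;
}

ToyKey toy_key(int lo, int hi)
{
    assert(lo < hi);   /* non-empty by construction */
    ToyKey k = { false, lo, hi };
    return k;
}

/* Penalty for adding entry to the subtree under node. Mixing empty and
 * non-empty is made prohibitively expensive; otherwise the penalty is
 * how much the bounding range would have to grow. */
float toy_penalty(ToyKey node, ToyKey entry)
{
    if (node.empty != entry.empty)
        return FLT_MAX;
    if (node.empty)
        return 0.0f;
    int lo = entry.lo < node.lo ? entry.lo : node.lo;
    int hi = entry.hi > node.hi ? entry.hi : node.hi;
    return (float) ((hi - lo) - (node.hi - node.lo));
}

/* Should a "contained by target" search descend into this subtree? */
bool toy_descend_contained_by(ToyKey node, ToyKey target)
{
    if (node.empty)
        return true;    /* empty ranges are contained by anything */
    if (target.empty)
        return false;   /* no non-empty range fits inside an empty one */
    /* Any non-empty match lies inside both node and target, so the two
     * must overlap. */
    return node.lo < target.hi && target.lo < node.hi;
}

/* Should an "= empty" search descend into this subtree? */
bool toy_descend_equals_empty(ToyKey node)
{
    return node.empty;
}
```

Under these assumptions, a non-empty internal key that does not overlap the target can be skipped, and an "= empty" search visits only the subtrees whose key is empty. A real picksplit would need the same segregation rule so that a page split never produces a mixed page.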

Thoughts, better ideas?

regards, tom lane
