Re: Using multidimensional indexes in ordinal queries

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thombrown(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using multidimensional indexes in ordinal queries
Date: 2010-06-21 21:20:31
Message-ID: AANLkTim8yryTqBV7rVucvN1h9SpIMgt8fFEmXJSO1_sE@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 21, 2010 at 5:42 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> It seems like you can get more or less the same benefit from a
> multicolumn btree index. On my system, with the individual btree
> indices, the query ran in 7625 ms; with an additional index on (v1,
> v2, v3), it ran in 94 ms. I didn't get the same plans as Alexander
> did, though, so it may not really be apples to apples. See attached
> session trace.
>
Benefit of multicolumn btree index was more or less the same than cube
benefit because of very bad picksplit behavior in this case. I attached the
patch which significally improves cube index search performance:

test=# explain (analyze, buffers) select * from test where
cube(ARRAY[v1,v2,v3]) <@ cube(ARRAY[480,480,'-inf'::float8],
ARRAY[500,500,'+inf'::float8]) order by cube(ARRAY[v1,v2,v3]) <*> 4 LIMIT
10;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..38.07 rows=10 width=28) (actual time=0.495..0.570
rows=10 loops=1)
Buffers: shared hit=21
-> Index Scan using test_cube_idx on test (cost=0.00..38064.52
rows=10000 width=28) (actual time=0.489..0.537 rows=10 loops=1)
Index Cond: (cube(ARRAY[v1, v2, v3]) <@ '(480, 480, -inf),(500,
500, inf)'::cube)
Sort Cond: (cube(ARRAY[v1, v2, v3]) <*> 4)
Buffers: shared hit=21
Total runtime: 0.659 ms
(7 rows)

Now this patch greatly increases tree construction time, but I believe that
picksplit implementation, that is good enough for tree search and tree
construction, can be found.

> The trouble is that it's hard to think of
> a way of teaching the planner about these cases without hard-coding
> lots and lots of special-case kludges into the planner. Still, if
> someone has a clever idea...

I think that two things can be done to improve the situation:
1) Make knngist deal with negative values. I think this will make easier
using knngist just for sorting, not only k-neighbor searching.
2) Let gist interface methods take care about multicolumn indexes. I think
that if cube index from the example above will be constructed on separate
columns v1, v2, v3 then it would be easier for planner to use cube index for
queries with filters on these columns. I don't know exactly how to do that.

Attachment Content-Type Size
cube.gz application/x-gzip 4.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-06-21 21:58:44 Re: Using multidimensional indexes in ordinal queries
Previous Message Andy Balholm 2010-06-21 21:07:14 Re: dividing money by money