Re: Minmax indexes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2013-09-30 17:20:19
Message-ID: 5249B2D3.6030606@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30.09.2013 19:49, Alvaro Herrera wrote:
> David Fetter wrote:
>> On Mon, Sep 30, 2013 at 02:17:39PM +0300, Heikki Linnakangas wrote:
>>> What would it take to abstract the minmax indexes to allow maintaing
>>> a bounding box for points, instead of a plain min/max? Or for
>>> ranges. In other words, why is this restricted to b-tree operators?
>>
>> If I had to guess, I'd guess, "first cut."
>
> Yeah, there were a few other simplifications in the design too, though I
> admit allowing for multidimensional dataypes hadn't occured to me

You can almost create a bounding box opclass in the current
implementation, by mapping < operator to "contains" and > to "not
contains". But there's no support for creating a new, larger, bounding
box on insert. It will just replace the max with the new value if it's
"greater than", when it should create a whole new value to store in the
index that covers both the old and the new values. (or less than? I'm
not sure which way those operators would work..)

When you think of the general case, it's weird that the current
implementation requires storing both the min and the max. For a bounding
box, you store the bounding box that covers all heap tuples in the
range. If that corresponds to "max", what does "min" mean?

In fact, even with regular b-tree operators, over integers for example,
you don't necessarily want to store both min and max. If you only ever
perform queries like "WHERE col > ?", there's no need to track the min
value. So to make this really general, you should be able to create an
index on only the minimum or maximum. Or if you want both, you can store
them as separate index columns. Something like:

CREATE INDEX minindex ON table (col ASC); -- For min
CREATE INDEX minindex ON table (col DESC); -- For max
CREATE INDEX minindex ON table (col ASC, col DESC); -- For both

That said, in practice most people probably want to store both min and
max. Maybe it's a bit too finicky if we require people to write "col
ASC, col DESC" to get that. Some kind of a shorthand probably makes sense.

> (though I will guess Simon did think about it and just didn't tell me to
> avoid me going overboard with stuff that would make the first version
> take forever).

Heh, and I ruined that great plan :-).

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-09-30 18:22:22 Re: Cmpact commits and changeset extraction
Previous Message Andres Freund 2013-09-30 17:00:06 Re: Wait free LW_SHARED acquisition