Re: index-only scans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: index-only scans
Date: 2011-10-12 16:39:36
Message-ID: 15104.1318437576@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
>> I was also toying with the notion of pushing the slot fill-in into the
>> AM, so that the AM API is to return a filled TupleSlot not an
>> IndexTuple. This would not save any cycles AFAICT --- at least in
>> btree, we still have to make a palloc'd copy of the IndexTuple so that
>> we can release lock on the index page. The point of it would be to
>> avoid the assumption that the index's internal storage has exactly the
>> format of IndexTuple. Don't know whether we'd ever have any actual use
>> for that flexibility, but it seems like it wouldn't cost much to
>> preserve the option.

> BTW, I concluded that that would be a bad idea, because it would involve
> the index AM in some choices that we're likely to want to change. In
> particular it would foreclose ever doing anything with expression
> indexes, without an AM API change. Better to just define the AM's
> responsibility as to hand back a tuple defined according to the index's
> columns.

Although this aspect of the code is now working well enough for btree,
I realized that it's going to have a problem if/when we add GiST
support. The difficulty is that the index rowtype includes "storage"
datatypes, not underlying-heap-column datatypes, for opclasses where
those are different. This is not going to do for cases where we need
to reconstruct a heap value from the index contents, as in Alexander's
example of gist point_ops using a box as the underlying storage.
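
To make the storage-type mismatch concrete, here is a minimal sketch in plain C. It is not PostgreSQL code: ToyPoint, ToyBox and both functions are hypothetical stand-ins, and it assumes a leaf entry holds a zero-area box whose corners both equal the indexed point. Under that assumption the heap value can be recovered, but only by doing a computation that knows about both types.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are NOT PostgreSQL's real Point/BOX structs. */
typedef struct { double x, y; } ToyPoint;
typedef struct { ToyPoint low, high; } ToyBox;

/* Heap column type -> index storage type: a point becomes a zero-area box. */
static ToyBox
toy_point_to_storage(ToyPoint p)
{
    ToyBox b = { p, p };

    return b;
}

/*
 * Index storage type -> heap column type.  Succeeds only for a degenerate
 * box (both corners equal); any wider box does not identify a single point,
 * so the function reports failure and leaves *out untouched.
 */
static bool
toy_storage_to_point(ToyBox b, ToyPoint *out)
{
    if (b.low.x != b.high.x || b.low.y != b.high.y)
        return false;
    *out = b.low;
    return true;
}
```

The point of the sketch is only that a row built from the storage type (the box) is not directly usable as the heap column value (the point); something has to convert it first.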

What we actually want back from the index AM is a rowtype that matches
the list of values submitted for indexing (i.e., the original output of
FormIndexDatum), and only for btree is it the case that that's
represented more or less exactly as the IndexTuple stored in the index.

So what I'm now thinking is to go back to the idea of having the index
AM fill in a TupleTableSlot. For btree this would just amount to moving
the existing StoreIndexTuple function into the AM. But it would give
GiST the opportunity to do some computation, and it would avoid the
problem of the index's rowtype not being a suitable intermediate format.
The objection I voiced above is misguided, because it confuses the set
of column types that's needed with the format distinction between a Slot
and an IndexTuple.
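
As a rough illustration of the slot-filling idea, the sketch below models it with a per-AM callback in plain C. Everything here is hypothetical (ToySlot, toy_fill_slot_fn, the two entry structs and toy_fetch are invented for the example and are not the PostgreSQL AM API); a point is simplified to two double values in the slot. The btree-like callback only copies, while the gist-like one computes the original values from its storage form.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_COLS 4

/* Hypothetical stand-in for a slot: values in indexed-column terms. */
typedef struct
{
    int     ncols;
    double  values[TOY_MAX_COLS];
} ToySlot;

/* Hypothetical per-AM callback: turn the AM's stored entry into slot values. */
typedef void (*toy_fill_slot_fn) (const void *stored, ToySlot *slot);

/* btree-like AM: the stored entry already holds the indexed values. */
typedef struct
{
    int     ncols;
    double  keys[TOY_MAX_COLS];
} ToyBtreeEntry;

static void
toy_btree_fill_slot(const void *stored, ToySlot *slot)
{
    const ToyBtreeEntry *e = stored;

    slot->ncols = e->ncols;
    for (int i = 0; i < e->ncols; i++)
        slot->values[i] = e->keys[i];   /* plain copy, no conversion */
}

/* gist-like AM: a point is assumed to be stored as a zero-area box. */
typedef struct
{
    double  lo_x, lo_y, hi_x, hi_y;
} ToyGistEntry;

static void
toy_gist_fill_slot(const void *stored, ToySlot *slot)
{
    const ToyGistEntry *e = stored;

    slot->ncols = 2;
    slot->values[0] = e->lo_x;          /* recover x from the box corner */
    slot->values[1] = e->lo_y;          /* recover y from the box corner */
}

/* Caller side: it sees only the slot, never the AM's storage format. */
static void
toy_fetch(toy_fill_slot_fn fill, const void *stored, ToySlot *slot)
{
    fill(stored, slot);
}
```

With this shape the caller never needs to know how an AM lays out its entries, which is the flexibility being argued for; the column types of the result are a separate question from whether the container is a slot or an index tuple.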

BTW, right at the moment I'm not that excited about actually doing
any work on GiST itself for index-only scans. Given the current list of
available opclasses there don't seem to be very many for which
index-only scans would be possible, so the amount of work needed seems
rather out of proportion to the benefit. But I don't mind fixing AM API
details that are needed to make this workable. I'd rather have the API
as right as possible in the first release.

regards, tom lane
