Re: Index-only-scans, indexam API changes

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Index-only-scans, indexam API changes
Date: 2009-07-13 15:32:22
Message-ID: 4A5B5386.3070100@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> One thought here is that an AM call isn't really free, and doing two of
> them instead of one mightn't be such a good idea. I would suggest
> either having a separate AM entry point to get both bits of data
> ("amgettupledata"?) or adding an optional parameter to amgettuple.

I'm thinking of adding a new flag to IndexScanDesc, "needIndexTuple".
When that is set, amgettuple stores a pointer to the IndexTuple in
another new field in IndexScanDesc, xs_itup. (In case of b-tree at
least, the IndexTuple is a palloc'd copy of the tuple on disk)

> [ thinks a bit ... ] At least for GIST, it is possible that whether
> data can be regurgitated will vary depending on the selected opclass.
> Some opclasses use the STORAGE modifier and some don't. I am not sure
> how hard we want to work to support flexibility there. Would it be
> sufficient to hard-code the check as "pgam says the AM can do it,
> and the opclass does not have a STORAGE property"? Or do we need
> additional intelligence about GIST opclasses?

Well, the way I have it implemented is that the am is not required to
return the index tuple, even if requested. I implemented the B-tree
changes similar to how we implement "kill_prior_tuple": in btgettuple,
lock the index page and see if the tuple is still at the same position
that we remembered in the scan opaque struct. If not (because of
concurrent changes to the page), we give up and don't return the index
tuple. The executor will then perform a heap fetch as before.

Alternatively, we could copy all the matching index tuples to private
memory when we step to a new index page, but that seems pretty expensive
if we're only going to use the first few matching tuples (LIMIT), and I
fear the memory management gets complicated. But in any case, the GiST
issue would still be there.

Since we're discussing it, I'm attaching the prototype patch I have for
returning tuples from b-tree and using them to filter rows before heap
fetches. I was going to post it in the morning along with description
about the planner and executor changes, but here goes. It applies on top
of the indexam API patch I started this thread with.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
indexfilter-1.patch text/x-diff 40.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2009-07-13 15:34:09 Re: Upgrading our minimum required flex version for 8.5
Previous Message Tom Lane 2009-07-13 15:16:23 Re: Index-only-scans, indexam API changes