Indexam interface proposal

Lists: pgsql-hackers
From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Indexam interface proposal
Date: 2007-03-19 12:23:01
Message-ID: 45FE80A5.9080208@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Currently amgettuple returns one matching tuple at a time, in index
order. I'm proposing two changes to add support for
- candidate matches
- partial ordering

Those two features are essential to make clustered indexes work, and in
the future, binned bitmap indexes that don't have a bitmap for each
distinct value but for ranges of values.

There's a third feature looming in the future, that I haven't addressed:
- returning index values, for index-only scans or at least for filtering
rows before fetching heap tuples.

I'm proposing that we keep the one tuple per call nature of the
interface, but add a flag to mark candidate matches. index_getnext or
the executor would need to recheck the original quals for tuples marked
as candidates.

Another flag would be used to mean "this tuple marks the boundary of a
partial ordering". Let's call it boundary_mark for now.

For example, if an index scan returned tuples with the following keys,
with tuples on same line meaning the index doesn't know their relative
ordering.

1
3 4 2
5 8 6 7
9
10

amgettuple would return the above tuples like this:

1 3 4 2 5 8 6 7 9 10
* * * * *

where the tuples marked with * would have the boundary_mark-flag set. If
the plan requires ordered results, index_getnext would have to sort
tuples between two markers before returning them to the caller.

amgetmulti would also need to have the candidate-flag added as I
proposed in the "Bitmapindexscan changes" patch I sent earlier to
pgsql-patches.

This interface change would solve much of the ugliness of my GIT patch,
by generalizing the index quals checking and sorting code to index_getnext.

Another source of ugliness in the patch is in inserting new tuples.
Inserts need to reach to the heap to fetch heap tuples, to compare keys
when splitting a group. I don't see a clean fix for that, but I don't
think it's as bad as the index scan code.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 12:59:46
Message-ID: 20070319125946.GB28777@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 19, 2007 at 12:23:01PM +0000, Heikki Linnakangas wrote:
> Currently amgettuple returns one matching tuple at a time, in index
> order. I'm proposing two changes to add support for
> - candidate matches

IIRC indexes can already ask to have the system recheck conditions on
returned tuples. For example GiST can return more tuples than actually
match. That's what the amopreqcheck column is for in pg_amop.

Or am I misunderstanding?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 13:11:15
Message-ID: 45FE8BF3.8010901@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martijn van Oosterhout wrote:
> On Mon, Mar 19, 2007 at 12:23:01PM +0000, Heikki Linnakangas wrote:
>> Currently amgettuple returns one matching tuple at a time, in index
>> order. I'm proposing two changes to add support for
>> - candidate matches
>
> IIRC indexes can already ask to have the system recheck conditions on
> returned tuples. For example GiST can return more tuples than actually
> match. That's what the amopreqcheck column is for in pg_amop.

Right, except that flag is per operator in operator class, and what I'm
proposing is that the index could pass a flag per tuple in the scan.
Some tuples in the scan might need rechecking, some might not. The need
for rechecking in clustered indexes is orthogonal to the need arising
from the lossyness of GiST operators.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 13:40:52
Message-ID: 45FE92E4.8020201@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Right, except that flag is per operator in operator class, and what I'm
> proposing is that the index could pass a flag per tuple in the scan.

That might make sense even for GiST. Sometimes complex compressions is used in
GiST opclasses. If indexing value is rather small then it's stored in index as
is, but large value is compressed with lossy techniques. So, GiST might return a
tuple which is allowed to not recheck.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 13:56:16
Message-ID: 45FE9680.8040107@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev wrote:
>> Right, except that flag is per operator in operator class, and what
>> I'm proposing is that the index could pass a flag per tuple in the scan.
>
> That might make sense even for GiST. Sometimes complex compressions is
> used in GiST opclasses. If indexing value is rather small then it's
> stored in index as is, but large value is compressed with lossy
> techniques. So, GiST might return a tuple which is allowed to not recheck.

Interesting. So we'd add a flag to the index tuples in GiST indicating
if the tuple is lossily compressed or not. The compress-function would
set that flag when it performs lossy compression, and gistgettuple would
return it to the caller.

That would completely replace the current RECHECK-option we have, right?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 13:58:10
Message-ID: 20070319135810.GD28777@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 19, 2007 at 04:40:52PM +0300, Teodor Sigaev wrote:
> >Right, except that flag is per operator in operator class, and what I'm
> >proposing is that the index could pass a flag per tuple in the scan.
>
> That might make sense even for GiST. Sometimes complex compressions is used
> in GiST opclasses. If indexing value is rather small then it's stored in
> index as is, but large value is compressed with lossy techniques. So, GiST
> might return a tuple which is allowed to not recheck.

Given that rechecking requires Expr and state structures, maybe it would
be easier to make the operators RECHECK so the planner does the right
thing now, but make a flag that tells the index scan *not* to recheck
this tuple. That would seem slightly less work and fit better with the
existing code. (In other words, it's an optimisation rather than a big
change).

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 14:14:22
Message-ID: 5020.1174313662@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
> Martijn van Oosterhout wrote:
>> IIRC indexes can already ask to have the system recheck conditions on
>> returned tuples. For example GiST can return more tuples than actually
>> match. That's what the amopreqcheck column is for in pg_amop.

> Right, except that flag is per operator in operator class, and what I'm
> proposing is that the index could pass a flag per tuple in the scan.

The reason for attaching the flag to operators is so that the system
(particularly the planner) can tell *which* conditions need to be
rechecked, and can prepare the necessary expression infrastructure.
I dislike the idea of having to be prepared to do that every time
for every indexscan. The notion of having to be prepared to sort
(according to what?) is even worse.

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 14:27:58
Message-ID: 45FE9DEE.5090600@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Interesting. So we'd add a flag to the index tuples in GiST indicating
> if the tuple is lossily compressed or not. The compress-function would
> set that flag when it performs lossy compression, and gistgettuple would
> return it to the caller.

Keys in GiST index may be another type than column on which index was created,
so gistgettuple can not return tuple in original form - but sometimes
gistgettuple may be sure that recheck isn't needed.

> That would completely replace the current RECHECK-option we have, right?
Yeah, this is possible.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-19 14:36:23
Message-ID: 45FE9FE7.8090807@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Given that rechecking requires Expr and state structures, maybe it would
> be easier to make the operators RECHECK so the planner does the right
> thing now, but make a flag that tells the index scan *not* to recheck
> this tuple. That would seem slightly less work and fit better with the
> existing code. (In other words, it's an optimisation rather than a big
> change).

I like your suggestion - it's useful for GiST/GIN fulltext indexing.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2007-03-20 18:38:36
Message-ID: 46002A2C.1040100@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>> Martijn van Oosterhout wrote:
>>> IIRC indexes can already ask to have the system recheck conditions on
>>> returned tuples. For example GiST can return more tuples than actually
>>> match. That's what the amopreqcheck column is for in pg_amop.
>
>> Right, except that flag is per operator in operator class, and what I'm
>> proposing is that the index could pass a flag per tuple in the scan.
>
> The reason for attaching the flag to operators is so that the system
> (particularly the planner) can tell *which* conditions need to be
> rechecked, and can prepare the necessary expression infrastructure.
> I dislike the idea of having to be prepared to do that every time
> for every indexscan.

I don't see any big down-side in preparing for that. We'd need to always
store the original index quals in the executor node, like we do now with
recheck-flagged operators, but that doesn't seem too bad to me.

I suppose we would want to keep the existing per-operator recheck-flag
and quals as it is, and add another field like indexqualorig to be used
to recheck tuples amgetnext flags as candidates.

> The notion of having to be prepared to sort
> (according to what?) is even worse.

That we wouldn't need for clustered indexes, if we change the current
design a bit. Either:
* store a sorted list of offsetnumbers for each group, instead of a bitmap,
* or store a bitmap like now, but require that heap tuples in a grouped
index tuple are in cluster order within the heap page.

The first option eats away some of the space savings, the second option
makes clustered indexes to become declustered quicker if there's
out-of-order updates or inserts.

Choosing either option would also reduce the CPU overhead of index
scans, because we could use binary search within a grouped index tuple.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Indexam interface proposal
Date: 2008-04-10 18:50:29
Message-ID: 200804101850.m3AIoTa22208@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Added to TODO:

* Allow index scans to return matching index keys

http://archives.postgresql.org/pgsql-hackers/2007-03/msg01079.php

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Currently amgettuple returns one matching tuple at a time, in index
> order. I'm proposing two changes to add support for
> - candidate matches
> - partial ordering
>
> Those two features are essential to make clustered indexes work, and in
> the future, binned bitmap indexes that don't have a bitmap for each
> distinct value but for ranges of values.
>
> There's a third feature looming in the future, that I haven't addressed:
> - returning index values, for index-only scans or at least for filtering
> rows before fetching heap tuples.
>
>
> I'm proposing that we keep the one tuple per call nature of the
> interface, but add a flag to mark candidate matches. index_getnext or
> the executor would need to recheck the original quals for tuples marked
> as candidates.
>
> Another flag would be used to mean "this tuple marks the boundary of a
> partial ordering". Let's call it boundary_mark for now.
>
> For example, if an index scan returned tuples with the following keys,
> with tuples on same line meaning the index doesn't know their relative
> ordering.
>
> 1
> 3 4 2
> 5 8 6 7
> 9
> 10
>
> amgettuple would return the above tuples like this:
>
> 1 3 4 2 5 8 6 7 9 10
> * * * * *
>
> where the tuples marked with * would have the boundary_mark-flag set. If
> the plan requires ordered results, index_getnext would have to sort
> tuples between two markers before returning them to the caller.
>
> amgetmulti would also need to have the candidate-flag added as I
> proposed in the "Bitmapindexscan changes" patch I sent earlier to
> pgsql-patches.
>
> This interface change would solve much of the ugliness of my GIT patch,
> by generalizing the index quals checking and sorting code to index_getnext.
>
> Another source of ugliness in the patch is in inserting new tuples.
> Inserts need to reach to the heap to fetch heap tuples, to compare keys
> when splitting a group. I don't see a clean fix for that, but I don't
> think it's as bad as the index scan code.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +