TODO item: Allow data to be pulled directly from indexes

Lists: pgsql-hackers
From: Karl Schnaitter <karlsch(at)soe(dot)ucsc(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-29 11:59:11
Message-ID: 4867790F.9050902@soe.ucsc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sometime last year, a discussion started about including visibility
metadata to avoid heap fetches during an index scan:

http://archives.postgresql.org/pgsql-patches/2007-10/msg00166.php
http://archives.postgresql.org/pgsql-patches/2008-01/msg00049.php

I think the last discussion on this was in April:

http://archives.postgresql.org/pgsql-hackers/2008-04/msg00618.php (last
item)

I have worked with the current patch, and I have some thoughts about
that approach and the approaches listed in the TODO item. The TODO lists
three approaches, in short

(1) Add a bit for an index tuple that indicates "visible" or "maybe
visible."
(2) Use a per-table bitmap that indicates which pages have at least one
tuple that is not visible to all transactions.
(3) Same as (2) but at the granularity of one bit per table.

The approach in the patch is different:

(4) Add transaction ids, etc to the index tuple (totaling 16 bytes)

I would group (1) & (4) together and (2) & (3) together. I think the
time and space trade-offs are pretty obvious, so I won't waste time on
those.

(1) & (4) require an UPDATE or DELETE to twiddle the old index tuple.
Tom has noted (in the linked message) that this is not reliable if the
index has any expression-valued columns, because it is not always
possible to find the old index entry. For this reason, the proposed
patch does not keep visibility metadata for indexes on expressions. This
seems like a reasonable limitation --- indexed expressions are just less
efficient.

The main difference between (1) & (4) is that (1) will sometimes require
heap lookups and (4) never will. Moreover, the heap lookups in (1) will
be difficult for the optimizer to estimate, unless some special
statistics can be maintained for this purpose.

I should mention there is a major flaw in the patch, because it puts
pointers to HOT tuples in the index, in order to capture the different
transaction ids in the chain. I think this can be fixed by only pointing
to the root of the HOT chain, and setting xmin/xmax to the entire range
of transaction ids spanned by the chain. I'm not sure about all the
details (the ctid and some other bits also need to be set).

(2) & (3) can work for any index, and they are quite elegant in the way
that the overhead does not change with the number of indexes. The TODO
also notes the benefit of (2) for efficient vacuuming. Thus, I think
that (2) is a great idea in general, but it does not serve the intended
purpose of this TODO item. Once a page gets marked as requiring
visibility checks, it cannot be unmarked until the next VACUUM. The
whole point of this feature is that we are willing to be more proactive
during updates in order to make index access more efficient.

So in summary, I think that (2) would be nice as a separate feature,
with (1) and (4) being more favorable for index-only scans. The obvious
trouble with (4) is the extra space overhead. There are also issues with
correctness that I mentioned (any thoughts here would be appreciated).
Other than that, I would favor (4) because it offers the most stable
performance.

Please let me know if you agree/disagree with anything here. I need to
get this feature implemented for my research, but I would also love to
contribute it to the community so your opinions matter a lot.


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Karl Schnaitter" <karlsch(at)soe(dot)ucsc(dot)edu>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-29 17:18:25
Message-ID: 87bq1kp3by.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Karl Schnaitter" <karlsch(at)soe(dot)ucsc(dot)edu> writes:

"Karl Schnaitter" <karlsch(at)soe(dot)ucsc(dot)edu> writes:

> (1) & (4) require an UPDATE or DELETE to twiddle the old index tuple. Tom has
> noted (in the linked message) that this is not reliable if the index has any
> expression-valued columns, because it is not always possible to find the old
> index entry. For this reason, the proposed patch does not keep visibility
> metadata for indexes on expressions. This seems like a reasonable limitation
> --- indexed expressions are just less efficient.

Or if the index operators and btproc aren't nearly as immutable as they claim.
Probably less likely than non-immutable index expressions but also possible.

> I should mention there is a major flaw in the patch, because it puts pointers
> to HOT tuples in the index, in order to capture the different transaction ids
> in the chain. I think this can be fixed by only pointing to the root of the HOT
> chain, and setting xmin/xmax to the entire range of transaction ids spanned by
> the chain. I'm not sure about all the details (the ctid and some other bits
> also need to be set).

I think you can think of a HOT chain as a single tuple. The xmin of the head
is the xmin of the chain and the xmax of the tail is the xmax of the chain.
The xmin/xmax of the intermediate versions are only interesting for
determining *which* of the HOT versions to look at, but the index pointer
points to the whole chain.

> (2) & (3) can work for any index, and they are quite elegant in the way that
> the overhead does not change with the number of indexes. The TODO also notes
> the benefit of (2) for efficient vacuuming. Thus, I think that (2) is a great
> idea in general, but it does not serve the intended purpose of this TODO item.
> Once a page gets marked as requiring visibility checks, it cannot be unmarked
> until the next VACUUM. The whole point of this feature is that we are willing
> to be more proactive during updates in order to make index access more
> efficient.

Well I think that's precisely the point. If you're trading off work done at
update time against work done for index accesses then you're only going to win
if the tuples are relatively static and have lots of accesses done against
them between updates. In which case having the optimization only kick in when
the page has been static for long enough that all the tuples are globally
visible should be good enough.

The case where index visibility info might win over a visibility map might be
if the tuples are being heavily updated by long-lived transactions. In which
case they never sit globally visible for very long but having the xmin/xmax in
the index might avoid having to do a heap access for tuples which haven't been
committed yet.

As you seem to realize there has been a lot of discussion in this area
already. The visibility map looks like a much more popular direction.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Karl Schnaitter <karlsch(at)soe(dot)ucsc(dot)edu>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-29 19:33:24
Message-ID: 4867E384.9080406@soe.ucsc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark wrote:
>> (1) & (4) require an UPDATE or DELETE to twiddle the old index tuple. Tom has
>> noted (in the linked message) that this is not reliable if the index has any
>> expression-valued columns, because it is not always possible to find the old
>> index entry. For this reason, the proposed patch does not keep visibility
>> metadata for indexes on expressions. This seems like a reasonable limitation
>> --- indexed expressions are just less efficient.
>>
>
> Or if the index operators and btproc aren't nearly as immutable as they claim.
> Probably less likely than non-immutable index expressions but also possible.
>
>
Your point is well taken... I'll have to look into that more.

>> (2) & (3) can work for any index, and they are quite elegant in the way that
>> the overhead does not change with the number of indexes. The TODO also notes
>> the benefit of (2) for efficient vacuuming. Thus, I think that (2) is a great
>> idea in general, but it does not serve the intended purpose of this TODO item.
>> Once a page gets marked as requiring visibility checks, it cannot be unmarked
>> until the next VACUUM. The whole point of this feature is that we are willing
>> to be more proactive during updates in order to make index access more
>> efficient.
>>
>
> Well I think that's precisely the point. If you're trading off work done at
> update time against work done for index accesses then you're only going to win
> if the tuples are relatively static and have lots of accesses done against
> them between updates. In which case having the optimization only kick in when
> the page has been static for long enough that all the tuples are globally
> visible should be good enough
I really don't understand this point. The way I see the visibility map
working is as follows: we set a page to "requires visibility check" when
a tuple on the page is inserted, deleted, or non-HOT updated. If the
only modifications have been inserts, we can reset the status to "all
tuples visible" when these tuples become universally visible, which
matches your description. But in the presence of deletes and updates, we
can only reset the status of a page after a VACUUM (I know that dead HOT
tuples can be pruned without VACUUM, but I don't think that's the case
for indexed tuples). We can't reset the status earlier because we don't
know what indexes still have pointers to the dead tuples. So a page can
be static indefinitely (after a single modification) without ever
getting to enjoy the optimization.

This is a really important point, so please let me know if I'm missing
something.

Thanks for your response!
Karl


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Karl Schnaitter" <karlsch(at)soe(dot)ucsc(dot)edu>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-30 07:33:36
Message-ID: 48688C50.2090702@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Karl Schnaitter wrote:
> The main difference between (1) & (4) is that (1) will sometimes require
> heap lookups and (4) never will. Moreover, the heap lookups in (1) will
> be difficult for the optimizer to estimate, unless some special
> statistics can be maintained for this purpose.

Yeah, we certainly should maintain a statistic for it.

> (2) & (3) can work for any index, and they are quite elegant in the way
> that the overhead does not change with the number of indexes. The TODO
> also notes the benefit of (2) for efficient vacuuming. Thus, I think
> that (2) is a great idea in general, but it does not serve the intended
> purpose of this TODO item. Once a page gets marked as requiring
> visibility checks, it cannot be unmarked until the next VACUUM. The
> whole point of this feature is that we are willing to be more proactive
> during updates in order to make index access more efficient.

In some cases we can mark a page earlier, as soon as we see that the
condition is true. Most importantly, when new tuples are inserted, we
can mark the page as soon as the inserting transaction is visible to all.

Also, the visibility map ought to make vacuums cheaper, as you only need
to scan the parts of the table that have beem modified since last
vacuum. You still need to scan all indexes, though. But assuming that
you somehow solve the correctness issues in the "add visibility fields
to index tuples" approach, we can use the same solution to perform
retail vacuums, which would bring vacuuming and the visibility map
approach on par with that approach anyway.

> Please let me know if you agree/disagree with anything here. I need to get this feature implemented for my research, but I would also love to contribute it to the community so your opinions matter a lot.

Well, I think the visibility map is a much better approach. This has
been discussed many times before, so I don't really have anything new to
add.

I've been working adding support for so-called "relation forks"
(http://archives.postgresql.org/message-id/48651722.8020600@enterprisedb.com),
to allow attaching metadata to relations, like the visibility map. I'm
going to use the facility for a new FSM implementation, which I'm
working on at the moment, but after that's done I'm going to start
working on the visibility map. And after that's done and working for
VACUUM, I'm going to work on using it for index-only-scans.

I'm not sure I have enough time to get all that done for 8.4, it's
looking bad at the moment, so help would be much appreciated. If you
don't agree with taking the visibility map approach, I would suggest
working on the indexam API changes first, to allow returning index
tuples from an index. I believe that part is the same regardless of how
we check the visibility.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Karl Schnaitter" <karlsch(at)soe(dot)ucsc(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-06-30 11:07:04
Message-ID: 87lk0n9o6f.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:

> I'm not sure I have enough time to get all that done for 8.4, it's looking bad
> at the moment, so help would be much appreciated. If you don't agree with
> taking the visibility map approach, I would suggest working on the indexam API
> changes first, to allow returning index tuples from an index. I believe that
> part is the same regardless of how we check the visibility.

That part is the elephant in the room in all these discussions.

I wonder if we want to have a new plan node for the heap accesses so they can
be postponed up above a join or other quals. Even for tuples which are of
questionable visibility we ought to be able to check cheap quals first before
checking visibility (though we might need a new function property to indicate
it's safe to call extra times on data which doesn't really exist -- immutable
doesn't mean it might not throw errors, for example).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Karl Schnaitter <karlsch(at)soe(dot)ucsc(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO item: Allow data to be pulled directly from indexes
Date: 2008-08-20 20:20:03
Message-ID: 200808202020.m7KKK3C20520@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I have added this email's URL to TODO under tuple visibility.

---------------------------------------------------------------------------

Karl Schnaitter wrote:
> Sometime last year, a discussion started about including visibility
> metadata to avoid heap fetches during an index scan:
>
> http://archives.postgresql.org/pgsql-patches/2007-10/msg00166.php
> http://archives.postgresql.org/pgsql-patches/2008-01/msg00049.php
>
> I think the last discussion on this was in April:
>
> http://archives.postgresql.org/pgsql-hackers/2008-04/msg00618.php (last
> item)
>
> I have worked with the current patch, and I have some thoughts about
> that approach and the approaches listed in the TODO item. The TODO lists
> three approaches, in short
>
> (1) Add a bit for an index tuple that indicates "visible" or "maybe
> visible."
> (2) Use a per-table bitmap that indicates which pages have at least one
> tuple that is not visible to all transactions.
> (3) Same as (2) but at the granularity of one bit per table.
>
> The approach in the patch is different:
>
> (4) Add transaction ids, etc to the index tuple (totaling 16 bytes)
>
> I would group (1) & (4) together and (2) & (3) together. I think the
> time and space trade-offs are pretty obvious, so I won't waste time on
> those.
>
> (1) & (4) require an UPDATE or DELETE to twiddle the old index tuple.
> Tom has noted (in the linked message) that this is not reliable if the
> index has any expression-valued columns, because it is not always
> possible to find the old index entry. For this reason, the proposed
> patch does not keep visibility metadata for indexes on expressions. This
> seems like a reasonable limitation --- indexed expressions are just less
> efficient.
>
> The main difference between (1) & (4) is that (1) will sometimes require
> heap lookups and (4) never will. Moreover, the heap lookups in (1) will
> be difficult for the optimizer to estimate, unless some special
> statistics can be maintained for this purpose.
>
> I should mention there is a major flaw in the patch, because it puts
> pointers to HOT tuples in the index, in order to capture the different
> transaction ids in the chain. I think this can be fixed by only pointing
> to the root of the HOT chain, and setting xmin/xmax to the entire range
> of transaction ids spanned by the chain. I'm not sure about all the
> details (the ctid and some other bits also need to be set).
>
> (2) & (3) can work for any index, and they are quite elegant in the way
> that the overhead does not change with the number of indexes. The TODO
> also notes the benefit of (2) for efficient vacuuming. Thus, I think
> that (2) is a great idea in general, but it does not serve the intended
> purpose of this TODO item. Once a page gets marked as requiring
> visibility checks, it cannot be unmarked until the next VACUUM. The
> whole point of this feature is that we are willing to be more proactive
> during updates in order to make index access more efficient.
>
> So in summary, I think that (2) would be nice as a separate feature,
> with (1) and (4) being more favorable for index-only scans. The obvious
> trouble with (4) is the extra space overhead. There are also issues with
> correctness that I mentioned (any thoughts here would be appreciated).
> Other than that, I would favor (4) because it offers the most stable
> performance.
>
> Please let me know if you agree/disagree with anything here. I need to
> get this feature implemented for my research, but I would also love to
> contribute it to the community so your opinions matter a lot.
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +