Re: Page at a time index scan

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Page at a time index scan
Date: 2006-05-03 09:14:38
Message-ID: 1146647678.449.42.camel@localhost.localdomain
Lists: pgsql-patches

On Tue, 2006-05-02 at 15:35 -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> > On Tue, 2 May 2006, Tom Lane wrote:
> >> Backwards scan may break this whole concept; are you sure you've thought
> >> it through?
>
> > I think so. The patch doesn't change the walk-left code. Do you have
> > something specific in mind?
>
> I'm worried about synchronization, particularly what happens if the page
> gets deleted from under you while you don't have it pinned.

Perhaps I should update my comments on "we don't need a pin at all"...

On a forward scan we need to hold a pin while we are reading a whole
page, but we can release it afterwards. We don't need to keep the pin
while servicing btgetnext() requests from our private page buffer.
(Which is what I meant to say.)
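
A minimal sketch of that forward-scan pattern, assuming a toy buffer
pool that only counts pins. BufferPool, ForwardScan and getnext() are
hypothetical names for illustration, not PostgreSQL's actual API:

```python
from typing import Dict, List, Optional, Tuple

# Toy page: (items on the page, block number of the right sibling or None).
PageData = Tuple[List[int], Optional[int]]


class BufferPool:
    """Hypothetical buffer pool that only tracks pin counts."""

    def __init__(self, pages: Dict[int, PageData]) -> None:
        self.pages = pages
        self.pins = {blk: 0 for blk in pages}

    def pin(self, blk: int) -> PageData:
        self.pins[blk] += 1
        return self.pages[blk]

    def unpin(self, blk: int) -> None:
        assert self.pins[blk] > 0, "unpin without a matching pin"
        self.pins[blk] -= 1


class ForwardScan:
    """Reads one page at a time into a private buffer, then drops the pin."""

    def __init__(self, pool: BufferPool, start_blk: int) -> None:
        self.pool = pool
        self.next_blk: Optional[int] = start_blk
        self.private: List[int] = []  # private copy of the current page's items
        self.pos = 0

    def _read_page(self, blk: int) -> None:
        items, right = self.pool.pin(blk)  # pin only while copying
        self.private = list(items)
        self.pos = 0
        self.next_blk = right
        self.pool.unpin(blk)               # released before any item is returned

    def getnext(self) -> Optional[int]:
        # Refill from the next page whenever the private buffer is exhausted.
        while self.pos >= len(self.private):
            if self.next_blk is None:
                return None
            self._read_page(self.next_blk)
        item = self.private[self.pos]
        self.pos += 1
        return item


pool = BufferPool({1: ([1, 2], 2), 2: ([], 3), 3: ([3], None)})
scan = ForwardScan(pool, start_blk=1)
results = []
pins_seen = []
while (item := scan.getnext()) is not None:
    results.append(item)
    pins_seen.append(sum(pool.pins.values()))  # pins held while the caller has an item
```

In this sketch no pin is held at any point where the caller is
consuming items, which is the property described above.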

AFAICS we will need to return to the page for a backward scan, so we
could just keep the pin the whole way. It's not possible to cache the
left page pointer because page splits to our immediate left can update
it even after we read the page contents. (A forward scan need never
fear page splits in the same way because existing items can't move past
the existing page boundary.)
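
To illustrate why a cached left pointer goes stale, here is a toy model
of sibling links, assuming a split moves the upper half of a page into a
new right sibling. Page and split() are hypothetical names and this is
not the real nbtree code:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    items: List[int]
    left: Optional[int] = None
    right: Optional[int] = None


def split(pages: Dict[int, Page], blkno: int, new_blkno: int) -> None:
    """Move the upper half of page blkno into a new right sibling."""
    old = pages[blkno]
    mid = len(old.items) // 2
    new = Page(items=old.items[mid:], left=blkno, right=old.right)
    old.items = old.items[:mid]
    if old.right is not None:
        # The old right neighbour's left link now points at the new page.
        pages[old.right].left = new_blkno
    old.right = new_blkno
    pages[new_blkno] = new


pages = {
    1: Page(items=[10, 20, 30, 40], right=2),
    2: Page(items=[50, 60], left=1),
}

# A backward scan sitting on page 2 remembers its left neighbour...
cached_left = pages[2].left

# ...then the page to its immediate left splits.
split(pages, 1, 3)

current_left = pages[2].left          # the link has moved to the new page
missed = pages[current_left].items    # items a scan using cached_left would skip
```

Following cached_left here would jump straight to page 1 and never see
the items that moved to the new page 3.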

We need never return to a page that *could* be deleted. While scanning
in either direction, if the whole page contains nothing but dead items
we can simply move straight on to the next page, having updated the
page status to half-dead. (The great thing about this patch is that we
should be able to report that somehow, so an asynchronous task handler
can come and clean that page (only), now that we don't have a
restriction on individual page vacuuming. We can think about how later.)

If only some of the index tuples are deleted, we should only return to
the page to update the deleted index tuples *if*:
- the page is still in the buffer pool. If it's been evicted, it's
because space is tight, so we shouldn't call it back just to dirty the
page.
- we have a minimum threshold of deleted tuples. Otherwise we might
re-dirty the page for just a single hint bit, so we end up writing the
page out hundreds of times. (Guess: the threshold should be 2 or 3.)
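
The two conditions above could be expressed roughly as follows. This is
only a sketch: should_revisit_page() and MIN_DEAD_TUPLES are
hypothetical names, and the default of 2 is just the guess from the
list above:

```python
MIN_DEAD_TUPLES = 2  # assumption: the guess above is "2 or 3"


def should_revisit_page(in_buffer_pool: bool,
                        dead_tuples: int,
                        min_dead: int = MIN_DEAD_TUPLES) -> bool:
    """Return True only if going back to mark dead tuples looks worthwhile."""
    if not in_buffer_pool:
        # Evicted page: space is tight, so don't read it back just to dirty it.
        return False
    # Avoid re-dirtying the page for a single hint bit.
    return dead_tuples >= min_dead
```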

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
