Re: [PERFORM] encouraging index-only scans

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2014-02-11 17:12:13
Message-ID: 20140211171213.GE2289@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Tue, Feb 11, 2014 at 11:28:36AM -0500, Robert Haas wrote:
> A sequential scan will set hint bits and will prune the page, but
> pruning the page doesn't ever mark it all-visible; that logic is
> entirely in vacuum. If that could be made cheap enough to be
> negligible, it might well be worth doing in heap_page_prune(). I
> think there might be a way to do that, but it's a bit tricky because
> the pruning logic iterates over the page in a somewhat complex way,
> not just a straightforward scan of all the item pointers the way the
> existing logic doesn't. It would be pretty cool if we could just use
> a bit out of the heap-prune xlog record to indicate whether the
> all-visible bit should be set; then we'd gain the benefit of marking
> things all-visible much more often without needing vacuum.
>
> That doesn't help insert-only tables much, though, because those won't
> require pruning. We set hint bits (which dirties the page) but
> currently don't write WAL. We'd have to change that to set the
> all-visible bit when scanning such a table, and that would be
> expensive. :-(

Yes, that pretty much sums it up. We introduced index-only scans in 9.2
(2012) but they still seem to be not usable for insert-only workloads
two years later. Based on current progress, it doesn't look like this
will be corrected until 9.5 (2015). I am kind of confused why this has
not generated more urgency.

I guess my question is what approach do we want to take to fixing this?
If we are doing pruning, aren't we emitting WAL? You are right that for
an insert-only workload, we aren't going to prune, but if pruning WAL
overhead is acceptable for a sequential scan, isn't index-only
page-all-visible WAL overhead acceptable?

Do we want to track the number of inserts in statistics and trigger an
auto-vacuum after a specified number of inserts? The problem there is
that we really don't need to do any index cleanup, which is what vacuum
typically does --- we just want to scan the table and set the
all-visible bits, so that approach seems non-optimal.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-02-11 17:15:26 Re: narwhal and PGDLLIMPORT
Previous Message Bruce Momjian 2014-02-11 16:37:41 Re: Performance Improvement by reducing WAL for Update Operation

Browse pgsql-performance by date

  From Date Subject
Next Message Andres Freund 2014-02-11 17:54:10 Re: [PERFORM] encouraging index-only scans
Previous Message Robert Haas 2014-02-11 16:28:36 Re: [PERFORM] encouraging index-only scans