Re: [PERFORM] encouraging index-only scans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2013-09-19 18:39:43
Message-ID: CA+TgmoZQyt253OA_ssD6OVs8o0-RkRz3eK6qZz3Utwp-2QVf6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Tue, Sep 17, 2013 at 7:10 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I generally think the current logic for triggering VACUUMs via
> autovacuum doesn't really make all that much sense in the days where we
> have the visibility map.

Right now, whether or not to autovacuum is the rest of a two-pronged
test. The first prong is based on number of updates and deletes
relative to table size; that triggers a regular autovacuum. The
second prong is based on age(relfrozenxid) and triggers a
non-page-skipping vacuum (colloquially, an anti-wraparound vacuum).

The typical case in which this doesn't work out well is when the table
has a lot of inserts but few or no updates and deletes. So I propose
that we change the first prong to count inserts as well as updates and
deletes when deciding whether it needs to vacuum the table. We
already use that calculation to decide whether to auto-analyze, so it
wouldn't be very novel. We know that the work of marking pages
all-visible will need to be done at some point, and doing it sooner
will result in doing it in smaller batches, which seems generally
good.

However, I do have one concern: it might lead to excessive
index-vacuuming. Right now, we skip the index vac step only if there
ZERO dead tuples are found during the heap scan. Even one dead tuple
(or line pointer) will cause an index vac cycle, which may easily be
excessive. So I further propose that we introduce a threshold for
index-vac; so that we only do index vac cycle if the number of dead
tuples exceeds, say 0.1% of the table size.

Thoughts? Let the hurling of rotten tomatoes begin.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2013-09-19 18:46:56 Re: UTF8 national character data type support WIP patch and list of open issues.
Previous Message David Johnston 2013-09-19 18:09:58 Re: Not In Foreign Key Constraint

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2013-09-19 20:47:11 Why is n_distinct always -1 for range types?
Previous Message Jim Nasby 2013-09-18 20:28:53 Re: [PERFORM] encouraging index-only scans