Re: New VACUUM FULL

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: New VACUUM FULL
Date: 2009-12-21 10:54:45
Message-ID: 1261392885.17644.2382.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2009-12-07 at 16:55 +0900, Itagaki Takahiro wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > You should take those out again; if I am the committer I certainly will.
> > Such a test will guarantee complete instability of every other
> > regression test, and it's not worth it.
>
> I read the original comment was saying to add regression tests for
> database-wide vacuums. But I'll reduce the range of vacuum if they
> are not acceptable.
>
> The new patch contains only table-based vacuum for local tables and some of
> system tables to test non-INPLACE vacuum are not used for system tables.
> VACUUM FULL pg_am;
> VACUUM FULL pg_class;
> VACUUM FULL pg_database;

Thanks for adding those additional tests.

I notice that during copy_heap_data() we make no attempt to skip pages
that are all visible according to the visibilitymap. It seems like it
would be a substantial win to copy whole blocks if all the
pre-conditions are met (I see what they are). I'm surprised to see that
neither CLUSTER nor VACUUM FULL made use of this previously. I think we
either need to implement that or document that vacuum will not skip
all-visible pages when running VACUUM FULL.

Also, I notice that if we perform new VACUUM FULL on a table it will
fully re-write the table and rebuild indexes, even if it hasn't found a
single row to remove.

Old VACUUM FULL was substantially faster than this on tables that had
nothing to remove. We aren't asking users to recode anything, so many
people will be performing "VACUUM FULL;" as usual every night or
weekend. If they do that it will result in substantially longer run
times in many databases, all while holding AccessExclusiveLocks.

Please can you arrange for the cluster operation to skip rebuilding
indexes if the table is not reduced in size?

Part of the reason why these happen is that the code hasn't been
refactored much at all from its original use for cluster. There are
almost zero comments to explain the additional use of this code for
VACUUM, or at least to explain it still all works even when there is no
index. e.g. check_index_is_clusterable() ought not to be an important
routine when there is no index being clustered. I'm seeing that the code
all works but that this patch isn't yet a sufficiently permanent change
to the code for me to commit, though it could be soon.

--
Simon Riggs www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2009-12-21 11:34:01 Re: alpha3 release schedule?
Previous Message Tim Bunce 2009-12-21 10:52:26 Re: Initial refactoring of plperl.c [PATCH]