Re: new autovacuum criterion for visible pages

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new autovacuum criterion for visible pages
Date: 2016-08-11 06:29:30
Message-ID: CAB7nPqT5MMdJmBiuxk=B7pAPym5JiQQ5WjPq5s_qupZVhyyiHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 11, 2016 at 5:39 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> I wanted to create a new relopt named something like
> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
> vacuum a table once less than a certain fraction of the relation's
> pages are marked allvisible.

Interesting idea.

> 1) One issue is that pg_class.relpages and pg_class.relallvisible are
> themselves only updated by vacuum/analyze. In the absence of manual
> vacuum or analyze, this means that if the new criterion uses those
> field, it could only kick in after an autoanalyze has already been
> done, which means that autovacuum_vacuum_pagevisible_factor could not
> meaningfully be set lower than autovacuum_analyze_scale_factor.
>
> Should relallvisible be moved/copied from pg_class to
> pg_stat_all_tables, so that it is maintained by the stats collector?
> Or should the autovacuum worker just walk the vm of every table with a
> defined autovacuum_vacuum_pagevisible_factor each time it is launched
> to get an up-to-date count that way?

relation_needs_vacanalyze has access to Form_pg_class, so it is not a
problem to use the value of relallvisible there to decide if a
vacuum/analyze should be run.

> 2) Should there be a guc in addition to the relopt? I can't think of
> a reason why I would want to set this globally, so I'm happy with just
> a relopt. If it were set globally, it would sure increase the cost
> for scanning the vm for each table once each naptime.

Having a GUC is useful to enforce the default behavior of tables that
do not have this parameter directly set with ALTER TABLE.

> 3) Should there be a autovacuum_vacuum_pagevisible_threshold? The
> other settings have both a factor and a threshold. I've never
> understand what the point of the threshold settings is, but presumably
> there is a point to them. Does that reason also apply to keeping vm
> tuned up?

Having both a threshold and a scale would make the most sense to me.
It may be difficult for the lambda user to tune those parameters using
a number of relation pages. An alternative would be to define those
values in kB, like 32MB worth of pages are marked all visible for
example.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-08-11 06:48:01 Re: Assertion failure in REL9_5_STABLE
Previous Message Michael Paquier 2016-08-11 06:14:45 Re: regression test for extended query protocol