Re: autovacuum next steps

From: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum next steps
Date: 2007-02-20 02:35:59
Message-ID: 45DA5E8F.2060503@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Gregory Stark wrote:
> If we could have autovacuum interrupt a vacuum in mid-sweep, perform a cycle
> of vacuums on smaller tables, then resume, that problem would go away. That
> sounds too difficult though, but perhaps we could do something nearly as good.

I think to make vacuum has this interrupted-resume capability is quite
useful for large table.

It can provide more flexibility for autovacuum to create a good schedule
scheme. Sometimes it takes a whole day to vacuum the large table
(Hundreds-GB table may qualify); setting the cost_delay make it even
lasts for several days. If the system has maintenance time, vacuum task
of the large table can be split to fit into the maintenance time by
interrupted-resume feature.

> One option that I've heard before is to have vacuum after a single iteration
> (ie, after it fills maintenance_work_mem and does the index cleanup and the
> second heap pass), remember where it was and pick up from that point next
> time.
>

Even a single iteration may take a long time, so it is not so much
useful to have a break in the boundary of the iteration. I think it is
not so difficult to get vacuum to remember where it leaves and start
from where it leaves last time. The following is a basic idea.

A typical vacuum process mainly have the following phases:
Phase 1. scan heap
Phase 2. scan and sweep index
Phase 3. sweep heap
Phase 4. update FSM
Phase 5. truncate CLOG

Where vacuum is interrupted, we can just save the collected information
into the disk, and restore it later when vacuum restarts. When vacuum
process is interrupted, we can remember the dead tuple list and the
block number it has scanned in phase 1; the indexes it has cleanup in
phase 2; the tuples it has swept in phase 3. Before exiting from vacuum,
we can also merge the free space information into FSM.

We are working on this feature now. I will propose it latter to discuss
with you.

Best Regards
Galy Lee
--
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2007-02-20 02:50:07 Re: [HACKERS] \copy (query) delimiter syntax error
Previous Message Bruce Momjian 2007-02-20 01:13:22 Re: ToDo: add documentation for operator IS OF