Re: autovacuum next steps, take 2

From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Hackers <pgsql-hackers(at)postgresql(dot)org>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: autovacuum next steps, take 2
Date: 2007-02-22 14:32:57
Message-ID: 45DDA999.9010905@zeut.net
Lists: pgsql-hackers

Jim C. Nasby wrote:
> On Wed, Feb 21, 2007 at 05:40:53PM -0500, Matthew T. O'Connor wrote:
>
>> My Proposal: If we require admins to identify hot tables, then:
>> 1) Launcher fires-off a worker1 into database X.
>> 2) worker1 deals with "hot" tables first, then regular tables.
>> 3) Launcher continues to launch workers to DB X every autovac naptime.
>> 4) worker2 (or 3 or 4 etc...) sees it is alone in DB X, if so it acts as
>> worker1 did above. If worker1 is still working in DB X then worker2
>> looks for hot tables that are being starved because worker1 got busy.
>> If worker2 finds no hot tables that need work, then worker2 exits.
>>
>
> Rather than requiring people to manually identify hot tables, what if we
> just prioritize based on table size? So if a second autovac process hits
> a specific database, it would find the smallest table in need of
> vacuuming that it should be able to complete before the next naptime and
> vacuum that. It could even continue picking tables until it can't find
> one that it could finish within the naptime. Granted, it would have to
> make some assumptions about how many pages it would dirty.
>
> ISTM that's a lot easier than forcing admins to mark specific tables.
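
For illustration, the size-based selection Jim describes could be sketched
roughly as below. This is only a sketch under assumptions: the Table record,
pick_tables(), and the pages_per_s throughput estimate are hypothetical names,
not anything in the backend, and the "how many pages it would dirty" guess is
reduced to a single pages-per-second figure.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    pages: int          # table size in pages (hypothetical stat)
    needs_vacuum: bool  # whether the table currently qualifies for vacuum


def pick_tables(tables, naptime_s, pages_per_s):
    """Greedily pick the smallest tables needing vacuum that fit in one naptime.

    pages_per_s is an assumed vacuum throughput; a real worker would have
    to estimate it somehow.
    """
    budget = naptime_s * pages_per_s  # pages we assume can be processed in time
    chosen = []
    candidates = sorted((t for t in tables if t.needs_vacuum),
                        key=lambda t: t.pages)
    for t in candidates:
        if t.pages > budget:
            break  # every remaining candidate is at least this large
        chosen.append(t.name)
        budget -= t.pages
    return chosen


tables = [
    Table("big_history", 500_000, True),
    Table("queue", 200, True),
    Table("sessions", 1_000, True),
    Table("config", 10, False),
]
picked = pick_tables(tables, naptime_s=60, pages_per_s=100)
print(picked)
```

With these made-up numbers the second worker would take the two small tables
and leave big_history to whichever worker is already grinding through it.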

So the heuristic would be:
* Launcher fires off workers into a database at a given interval
(perhaps configurable?)
* Each worker works on tables in size order.
* If a worker ever catches up to an older worker, then the younger
worker exits.

This sounds simple and workable to me. Perhaps we can later modify this
to include some max_workers variable, so that a worker would only exit if
it catches an older worker and there are max_workers currently active.
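
To make the exit rule concrete, here is a rough sketch of the decision a
worker would make, including the possible max_workers refinement. The names
should_exit(), my_pos, and older_positions are made up for the example, and
it assumes "catching up" means a worker's position in the size-ordered table
list has reached that of an older worker.

```python
def should_exit(my_pos, older_positions, active_workers, max_workers=None):
    """Decide whether a worker should exit.

    my_pos          -- index (in size order) of the table this worker is at
    older_positions -- indexes of the tables older workers are at
    active_workers  -- number of workers currently active in the database
    max_workers     -- optional cap; None means the basic rule applies
    """
    # Assumption: a worker has caught an older one once its position
    # reaches or passes the older worker's position.
    caught_up = any(my_pos >= pos for pos in older_positions)
    if not caught_up:
        return False
    if max_workers is None:
        # Basic heuristic: the younger worker exits as soon as it catches up.
        return True
    # Refinement: only exit if the worker limit has already been reached.
    return active_workers >= max_workers


# Basic rule: a worker at table 3 has caught an older worker at table 3.
print(should_exit(3, [3], active_workers=2))
# With max_workers=4 and only 2 active, the same worker keeps going.
print(should_exit(3, [3], active_workers=2, max_workers=4))
```

What a worker does instead of exiting when it is under the limit (skip past
the older worker's table, presumably) is left open here.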

Thoughts?
