Re: Autovacuum Improvements

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Autovacuum Improvements
Date: 2006-12-31 15:06:45
Message-ID: 87vejsf6ka.fsf@wolfe.cbbrowne.com
Lists: pgsql-general pgsql-hackers

A long time ago, in a galaxy far, far away, alvherre(at)commandprompt(dot)com (Alvaro Herrera) wrote:
> Christopher Browne wrote:
>
>> Seems to me that you could get ~80% of the way by having the
>> simplest "2 queue" implementation, where tables with size < some
>> threshold get thrown at the "little table" queue, and tables above
>> that size go to the "big table" queue.
>>
>> That should keep any small tables from getting "vacuum-starved."
>
> Hmm, would it make sense to keep 2 queues, one that goes through the
> tables in smaller-to-larger order, and the other one in the reverse
> direction?

Interesting approach; that would mean having just one priority queue,
ordered by table size, for all the work. That seems to simplify
things a bit, which is a good thing.

Unifying policies further might have some merit, too. The worker
processes (that do the vacuuming) could be set up to alternate between
head and tail of the queue. That is, a worker process could vacuum
the littlest table and then go after the biggest table. That way,
they'd eat at both ends towards the middle. Adding more workers could
easily add to the speed at which both ends of the queue get eaten
(assuming you've got the I/O to support having 4 or 5 vacuums running
concurrently).
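
To make the "eat at both ends" idea concrete, here is a rough sketch
for a single worker. It is only an illustration: the function name and
the (name, size) pairs are made up, and nothing here reflects how
autovacuum is actually implemented.

```python
from collections import deque


def drain_from_both_ends(tables):
    """Yield table names: smallest, largest, next smallest, and so on.

    `tables` is an iterable of (name, size) pairs. The sizes are
    hypothetical numbers used only to order the queue.
    """
    # One queue for all the work, sorted smallest-first.
    queue = deque(sorted(tables, key=lambda t: t[1]))
    take_small = True
    while queue:
        # Alternate between the head (small) and the tail (big).
        name, _size = queue.popleft() if take_small else queue.pop()
        yield name
        take_small = not take_small


order = list(drain_from_both_ends(
    [("a", 10), ("b", 5000), ("c", 200), ("d", 90000)]
))
print(order)
```

With several workers, each would run the same loop against a shared
queue, which would then need some form of locking.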

There is one potential problem with that: the thing we never want is
for all the workers to get busy on the biggest tables so that the
little ones are no longer being serviced. So there needs to be a way
to make sure that there's one worker devoted to "little tables." I
suppose the rule may be that the 1st worker process *never* goes after
the biggest tables.

That ought to be enough to prevent starvation.
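
As a sketch of that rule, under made-up names: worker 0 only ever
takes from the small end of the queue, while every other worker
alternates between the two ends. The round-robin loop below merely
stands in for real concurrent workers, and the table sizes are
hypothetical.

```python
from collections import deque


def next_table(queue, worker_id, last_took_small):
    """Pick the next table for a worker.

    Returns (table, took_small), or None if the queue is empty.
    """
    if not queue:
        return None
    # Worker 0 is reserved for little tables: always take the head.
    # Other workers take the head only if their last pick was a big one.
    if worker_id == 0 or not last_took_small:
        return queue.popleft(), True
    return queue.pop(), False


def simulate(tables, n_workers):
    """Hand out tables to workers in turn; return (worker, name) picks."""
    queue = deque(sorted(tables, key=lambda t: t[1]))
    last_small = [False] * n_workers
    picks = []
    while queue:
        for w in range(n_workers):
            choice = next_table(queue, w, last_small[w])
            if choice is None:
                break
            (name, _size), last_small[w] = choice
            picks.append((w, name))
    return picks


picks = simulate([("t%d" % i, i) for i in range(1, 7)], n_workers=2)
print(picks)
```

However many workers are chewing on the big end, worker 0 always
takes the smallest table still waiting.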

> I am currently writing a design on how to create "vacuum queues" but
> I'm thinking that maybe it's getting too complex to handle, and a
> simple idea like yours is enough (given sufficient polish).

There's plenty to like about coming up with a reasonable set of
heuristics...
--
output = ("cbbrowne" "@" "acm.org")
http://linuxdatabases.info/info/slony.html
Rules of the Evil Overlord #191. "I will not appoint a relative to my
staff of advisors. Not only is nepotism the cause of most breakdowns
in policy, but it also causes trouble with the EEOC."
<http://www.eviloverlord.com/>
