Re: [HACKERS] Autovacuum Improvements

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Russell Smith" <mr-russ(at)pws(dot)com(dot)au>, "Darcy Buskermolen" <darcyb(at)commandprompt(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "Pavan Deolasee" <pavan(at)enterprisedb(dot)com>, "Christopher Browne" <cbbrowne(at)acm(dot)org>, <pgsql-general(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Autovacuum Improvements
Date: 2007-01-22 17:41:06
Message-ID: 1169487666.3776.368.camel@silverbirch.site
Lists: pgsql-general pgsql-hackers

On Mon, 2007-01-22 at 12:18 -0500, Bruce Momjian wrote:
> Heikki Linnakangas wrote:
> >
> > In any case, for the statement "Index cleanup is the most expensive part
> > of vacuum" to be true, your indexes would have to take up 2x as much
> > space as the heap, since the heap is scanned twice. I'm sure there are
> > databases like that out there, but I don't think it's the common case.
>
> I agree index cleanup isn't > 50% of vacuum. I was trying to figure
> out how small, and it seems about 15% of the total table, which means if
> we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
> 80%, assuming 5% of the table is scanned.

Clearly keeping track of what needs vacuuming will lead to a more
efficient VACUUM. Your math applies to *any* design that uses some form
of book-keeping to focus in on the hot spots.
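
As a rough check of the arithmetic quoted above, here is a minimal sketch
in Python. It assumes one particular reading of Bruce's numbers: index
cleanup is about 15% of a full vacuum's cost and is still paid in full,
while the heap share (the remaining 85%) shrinks in proportion to the
fraction of the table actually scanned. The function name and the cost
model are illustrative assumptions, not measurements.

```python
def vacuum_cost_fraction(index_share, heap_fraction_scanned):
    """Relative cost of a targeted vacuum, where a full vacuum costs 1.0.

    Assumed model: index cleanup cost is unchanged, heap cost scales
    linearly with the fraction of the heap that has to be scanned.
    """
    heap_share = 1.0 - index_share
    return index_share + heap_share * heap_fraction_scanned


# Figures from the quoted message: index cleanup ~15%, 5% of table scanned.
remaining = vacuum_cost_fraction(index_share=0.15, heap_fraction_scanned=0.05)
saving = 1.0 - remaining

# remaining is about 0.19 of the original cost, so the saving is about 0.81,
# in line with the "perhaps 80%" estimate.
print(f"remaining cost: {remaining:.4f}, saving: {saving:.4f}")
```

Under this model the index share sets a floor on the cost: with 15%
index cleanup, no amount of heap book-keeping saves more than 85%.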

On a separate thread, Heikki has raised a different idea for VACUUM.

Heikki's idea asks an important question: where and how should DSM
(Dead Space Map) information be maintained? Up to now everybody has
assumed that it would be maintained when DML took place and that the
DSM would be a transactional data structure (i.e. on-disk). Heikki's
idea has bookkeeping requirements similar to the original DSM concept,
but the interesting aspect is that the DSM information is collected
off-line, rather than adding overhead to every statement's response
time.

That idea seems extremely valuable to me.

One of the main challenges is how we cope with large tables that have a
very fine spray of updates against them. A DSM bitmap won't help with
that situation, regrettably.
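
To illustrate why a fine spray of updates defeats a page-level bitmap,
here is a small sketch in Python. It assumes the simplest possible model:
each update lands on a page chosen uniformly at random, and one bit per
heap page marks "needs vacuuming". The function name and the page counts
are hypothetical; real update patterns will differ.

```python
def expected_dirty_fraction(num_pages, num_updates):
    """Expected fraction of pages touched at least once.

    Assumes each update hits one page chosen uniformly at random,
    independently of the others.
    """
    untouched = (1.0 - 1.0 / num_pages) ** num_updates
    return 1.0 - untouched


pages = 1_000_000

# Fine spray: updates equal to 1% of the page count, spread over the table.
sparse = expected_dirty_fraction(pages, 10_000)

# Heavier spray: as many updates as pages, still uniformly spread.
heavy = expected_dirty_fraction(pages, 1_000_000)

# sparse is just under 0.01 (almost every update dirties a new page);
# heavy is about 0.63, so most of the table has to be scanned anyway.
print(f"sparse: {sparse:.4f}, heavy: {heavy:.4f}")
```

With scattered updates nearly every update marks a fresh page, so the
bitmap fills up quickly; the same number of updates concentrated in a
hot spot would leave most bits clear.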

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
