Re: [HACKERS] Autovacuum Improvements

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, Darcy Buskermolen <darcyb(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Pavan Deolasee <pavan(at)enterprisedb(dot)com>, Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Autovacuum Improvements
Date: 2007-01-22 15:56:20
Message-ID: 45B4DEA4.6070503@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Bruce Momjian wrote:
> Heikki Linnakangas wrote:
>> Russell Smith wrote:
>>> 2. Index cleanup is the most expensive part of vacuum. So doing a
>>> partial vacuum actually means more I/O as you have to do index cleanup
>>> more often.
>> I don't think that's usually the case. Index(es) are typically only a
>> fraction of the size of the table, and since 8.2 we do index vacuums in
>> a single scan in physical order. In fact, in many applications the index
>> is be mostly cached and the index scan doesn't generate any I/O at all.
>
> Are _all_ the indexes cached? I would doubt that.

Well, depends on your schema, of course. In many applications, yes.

> Also, for typical
> table, what percentage is the size of all indexes combined?

Well, there's no such thing as a typical table. As an anecdote here's
the ratios (total size of all indexes of a table)/(size of corresponding
heap) for the bigger tables for a DBT-2 run I have at hand:

Stock: 1190470/68550 = 6%
Order_line: 950103/274372 = 29%
Customer: 629011 /(5711+20567) = 8%

In any case, for the statement "Index cleanup is the most expensive part
of vacuum" to be true, you're indexes would have to take up 2x as much
space as the heap, since the heap is scanned twice. I'm sure there's
databases like that out there, but I don't think it's the common case.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2007-01-22 15:57:57 Re: CAST function for user defined type
Previous Message Jan Muszynski 2007-01-22 15:55:43 Re: security question

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2007-01-22 16:11:43 Re: savepoint improvements
Previous Message Gregory Stark 2007-01-22 15:53:01 Re: pg_dump ANALYZE statements