Deadline-Based Vacuum Delay

Lists: pgsql-hackers
From: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Deadline-Based Vacuum Delay
Date: 2006-12-28 07:09:18
Message-ID: 45936D9E.1070108@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I'd like to propose a new feature, Deadline-Based Vacuum Delay, the
syntax is something like "VACUUM IN 6 HOURS".

Vacuum is a non-trivial task to be performed. The database needs to be
vacuumed before the system performance suffers from the garbage; it also
needs to ensure the system won't be hammered during the vacuum operation.

The cost-based delay vacuum can reduce the impact of the disk I/O storm,
but sometimes it can last for several hours and it is hard to know when
it will end. So there are many complains about the unpredictable
execution time of vacuum.

On the other hand, several users want to run vacuum in their maintenance
window. Also, autovacuum kicks vacuum when a certain amount of garbage
has been generated. It restricts the interval of each vacuum for a relation.

So I am thinking another way to perform vacuum. For example vacuum can
be refined in a maintenance time frame like "VACUUM IN 6 HOURS", and
then vacuum operation will be performed within the window. The delay
time is adjusted internally to spread the disk I/O over the time frame.
This may make vacuum more predictable, also ensures vacuum doesn't have
side effect on the producing system outside the maintenance window.

Any ideas or comments?

Best Regards,
--
Galy Lee
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2006-12-28 15:01:25
Message-ID: 6980.1167318085@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp> writes:
> So I am thinking another way to perform vacuum. For example vacuum can
> be refined in a maintenance time frame like "VACUUM IN 6 HOURS", and
> then vacuum operation will be performed within the window. The delay
> time is adjusted internally to spread the disk I/O over the time frame.

And you will manage that how? The total amount of work to be done is
quite unpredictable.

regards, tom lane


From: "Jaime Casanova" <systemguards(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Galy Lee" <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2006-12-29 00:27:48
Message-ID: c2d9e70e0612281627p67a5bebeu953905ba13a765af@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/28/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > So I am thinking another way to perform vacuum. For example vacuum can
> > be refined in a maintenance time frame like "VACUUM IN 6 HOURS", and
> > then vacuum operation will be performed within the window. The delay
> > time is adjusted internally to spread the disk I/O over the time frame.
>
> And you will manage that how? The total amount of work to be done is
> quite unpredictable.
>

specially for something you already can do with cron (*nix) or job
scheduler (windows)

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
Richard Cook


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2006-12-29 17:30:28
Message-ID: 60y7oqzk23.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

systemguards(at)gmail(dot)com ("Jaime Casanova") writes:
> On 12/28/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> > So I am thinking another way to perform vacuum. For example vacuum can
>> > be refined in a maintenance time frame like "VACUUM IN 6 HOURS", and
>> > then vacuum operation will be performed within the window. The delay
>> > time is adjusted internally to spread the disk I/O over the time frame.
>>
>> And you will manage that how? The total amount of work to be done is
>> quite unpredictable.
>
> specially for something you already can do with cron (*nix) or job
> scheduler (windows)

That seems like a nonsequitor here...

Using cron to try to "make vacuums spread over 6 hours" seems to me
like an attempt to try to do that in as terrible a way possible.

If you're trying to spread work over time, you need to use something
that actively monitors the results, as opposed to a tool that is
entirely "fire and forget" (and perhaps forget to work), like cron.

It seems to me that the answer is much more along the lines of making
a greatly more intelligent autovacuum. Something offering:
- Queueing work (e.g. - in this case, we want to schedule a vacuum
of "everything")
- Processing that work; perhaps sometimes with multiple threads to
do multiple vacuums
- Perhaps using time estimates to determine any vacuum delay
GUC values to be applied

How you get the work to spread consistently across 6 hours is a
challenge; personally, my preference would generally be to try to get
the work done ASAP, so the goal seems a tad off to me...
--
let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/x.html
The human race will decree from time to time: "There is something at
which it is absolutely forbidden to laugh."
-- Nietzche on Common Lisp


From: "Jaime Casanova" <systemguards(at)gmail(dot)com>
To: "Chris Browne" <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2006-12-29 23:42:10
Message-ID: c2d9e70e0612291542vc073ba2kadabc6535c09ec30@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/29/06, Chris Browne <cbbrowne(at)acm(dot)org> wrote:
> systemguards(at)gmail(dot)com ("Jaime Casanova") writes:
> > On 12/28/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp> writes:
> >> > So I am thinking another way to perform vacuum. For example vacuum can
> >> > be refined in a maintenance time frame like "VACUUM IN 6 HOURS", and
> >> > then vacuum operation will be performed within the window. The delay
> >> > time is adjusted internally to spread the disk I/O over the time frame.
> >>
> >> And you will manage that how? The total amount of work to be done is
> >> quite unpredictable.
> >
> > specially for something you already can do with cron (*nix) or job
> > scheduler (windows)
>
> That seems like a nonsequitor here...
>
> Using cron to try to "make vacuums spread over 6 hours" seems to me
> like an attempt to try to do that in as terrible a way possible.
>

that's entirely my point... what th op was proposing was a command
"VACUUM IN n HOURS"... he want to write the command... he didn't talk
about a tool for monitoring jus a tool that he "fire and forget"... at
least that was the way a read it...

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
Richard Cook


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2006-12-30 06:48:36
Message-ID: 4550.1167461316@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Chris Browne <cbbrowne(at)acm(dot)org> writes:
> How you get the work to spread consistently across 6 hours is a
> challenge; personally, my preference would generally be to try to get
> the work done ASAP, so the goal seems a tad off to me...

I think the context for this is that you have an agreed-on maintenance
window, say extending from 2AM to 6AM local time, and you want to get
all your vacuuming done in that window without undue spikes in the
system load (because you do still have live users then, just not as many
as during prime time). If there were a decent way to estimate the
amount of work to be done then it'd be possible to spread the work
fairly evenly across the window. What I do not see is where you get
that estimate from --- especially since you probably have more than one
table to vacuum in your window.

The other problem is that "vacuum only during a maintenance window"
doesn't seem all that compelling a policy anyway. We see a lot of
examples of tables that need to be vacuumed much more often than once
a day. So I'd rather put effort into making sure that vacuum can be run
in the background even under high load, instead of designing around a
maintenance-window assumption.

regards, tom lane


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2007-01-03 15:26:34
Message-ID: 92B73DA8-A02A-4FE3-B0F3-D276A3C54F5C@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 29, 2006, at 12:30 PM, Chris Browne wrote:
> How you get the work to spread consistently across 6 hours is a
> challenge; personally, my preference would generally be to try to get
> the work done ASAP, so the goal seems a tad off to me...

Agreed. If we're going to monkey with automatically setting vacuum
cost GUCs I'd *much* rather work towards having some kind of I/O
priority scheme; that would allow vacuum to use as much I/O as it
wants, provided nothing else in the system needs it.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2007-01-05 09:45:45
Message-ID: 459E1E49.7030203@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> I think the context for this is that you have an agreed-on maintenance
> window, say extending from 2AM to 6AM local time, and you want to get
> all your vacuuming done in that window without undue spikes in the
> system load (because you do still have live users then, just not as many
> as during prime time). If there were a decent way to estimate the
> amount of work to be done then it'd be possible to spread the work
> fairly evenly across the window. What I do not see is where you get
> that estimate from --- especially since you probably have more than one
> table to vacuum in your window.

It is true that there is not a decent way to estimate the amount of work
to be done. But the purpose in here is not “spread the vacuum over 6
hours exactly”, it is “finish vacuum within 6 hours, and spread the
spikes as much as possible”. So the maximum estimation of the work is
enough to refine the vacuum within the window, it is fine if vacuum run
quickly than schedule. Also we don’t need to estimate the time of
vacuum, we only need to compare the actual progress of time window and
the progress of the work, and then adjust them to have the same pace in
the delay point.

The maximum of the work of vacuum can be estimated by size of the heap,
the size of the index, and the number of dead tuples. For example the
lazy vacuum has the following works:
1. scan heap
2. vacuum index
3. vacuum heap
4. truncate heap
Although 2 and 4 are quite unpredictable, but the total amount of work
including 1, 2, 3, and 4 can be estimated.

> The other problem is that "vacuum only during a maintenance window"
> doesn't seem all that compelling a policy anyway. We see a lot of
> examples of tables that need to be vacuumed much more often than once
> a day. So I'd rather put effort into making sure that vacuum can be run
> in the background even under high load, instead of designing around a
> maintenance-window assumption.

This feature is not necessary has a maintenance window assumption. For
example, if a table needs to be vacuumed every 3 hours to sweep the
garbage, then instead of tuning cost delay GUC hardly to refine vacuum
in 3 hours, we can make vacuum finish within the time frame by “VACUUM
IN time” feature.

If we can find a good way to tune the cost delay GUC to enable vacuum to
catch up with the speed of garbage generation in the high frequency
update system, then we won’t need this feature. For example, the
interval of two vacuums can be estimated by tracking the speed of the
dead tuple generation, but how can you tune the vacuum time to fit in
the interval of two vacuums? It seems that there is not easy to tune the
delay time of vacuum correctly.

Best Regards
--
Galy Lee <lee.galy _at_ oss.ntt.co.jp>
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadline-Based Vacuum Delay
Date: 2007-01-05 14:54:15
Message-ID: 15581.1168008855@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Galy Lee <lee(dot)galy(at)oss(dot)ntt(dot)co(dot)jp> writes:
> It is true that there is not a decent way to estimate the amount of work
> to be done. But the purpose in here is not spread the vacuum over 6
> hours exactly, it is finish vacuum within 6 hours, and spread the
> spikes as much as possible. So the maximum estimation of the work is
> enough to refine the vacuum within the window, it is fine if vacuum run
> quickly than schedule.

Is it? If I tell the thing to take 6 hours and it finishes in 5
minutes, why would I be happy? It could obviously have spread out the
work more, and presumably if I'm using this feature at all then I want
the least possible load added from vacuum while it's running.

But this is all academic, because there's no way to produce a
trustworthy "maximum estimate" either.

regards, tom lane