Re: autovacuum truncate exclusive lock round two

From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>,"Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>,"Amit Kapila" <amit(dot)kapila(at)huawei(dot)com>, "Stephen Frost" <sfrost(at)snowman(dot)net>, "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum truncate exclusive lock round two
Date: 2012-12-09 19:37:41
Message-ID: 20121209193742.142860@gmx.com
Lists: pgsql-hackers

Jan Wieck wrote:

> Based on the discussion and what I feel is a consensus I have
> created an updated patch that has no GUC at all. The hard coded
> parameters in include/postmaster/autovacuum.h are
>
>  AUTOVACUUM_TRUNCATE_LOCK_CHECK_INTERVAL 20 /* ms */
>  AUTOVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL 50 /* ms */
>  AUTOVACUUM_TRUNCATE_LOCK_TIMEOUT 5000 /* ms */

Since these really aren't part of the external API and are only
referenced in vacuumlazy.c, it seems more appropriate to define
them there.
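
As a rough sketch of that placement (the macro names and values are the
ones quoted above; the helper function is hypothetical and only there to
show how the wait interval and timeout relate, assuming the retry budget
is simply the timeout divided by the wait interval):

```c
#include <assert.h>

/*
 * Names and values as quoted from the patch, but defined locally in the
 * one file that uses them (vacuumlazy.c) rather than in a shared header.
 */
#define AUTOVACUUM_TRUNCATE_LOCK_CHECK_INTERVAL 20   /* ms */
#define AUTOVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL  50   /* ms */
#define AUTOVACUUM_TRUNCATE_LOCK_TIMEOUT        5000 /* ms */

/*
 * Hypothetical helper, not part of the patch: number of lock attempts
 * that fit in the timeout if we sleep wait_interval_ms between attempts.
 */
int
max_truncate_lock_attempts(int timeout_ms, int wait_interval_ms)
{
    assert(wait_interval_ms > 0);
    return timeout_ms / wait_interval_ms;
}
```

With the values above, that would allow 100 attempts before autovacuum
gives up on reacquiring the lock.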

> I gave that the worst workload I can think of. A pgbench (style)
> application that throws about 10 transactions per second at it,
> so that there is constantly the need to give up the lock due to
> conflicting lock requests and then reacquiring it again. A
> "cleanup" process is periodically moving old tuples from the
> history table to an archive table, making history a rolling
> window table. And a third job that 2-3 times per minute produces
> a 10 second lasting transaction, forcing autovacuum to give up on
> the lock reacquisition.
>
> Even with that workload autovacuum slow but steady is chopping
> away at the table.

Applies with minor offsets, builds without warning, and passes
`make check-world`. My tests based on your earlier posted test
script confirm the benefit.

There are some minor white-space issues; for example, `git diff
--color` shows some trailing spaces in comments.

There are no docs, but since there are no user-visible changes in
behavior other than better performance and more prompt and reliable
truncation of tables where we were already doing so, it doesn't seem
like any new docs are needed. Due to the nature of the problem,
tests are tricky to run correctly and take a long time to run, so I
don't see how any regression tests would be appropriate, either.

This patch seems ready for committer, and I would be comfortable
with making the minor changes I mention above and committing it.
I'll wait a day or two to allow any other comments or objections.

To summarize, there has been pathological behavior in an
infrequently-encountered corner case of autovacuum, wasting a lot
of resources indefinitely when it is encountered; this patch gives
a major performance improvement in this situation without any
other user-visible change and without requiring any new GUCs. It
adds a new public function in the locking area to allow a process
to check whether a particular lock it is holding is blocking any
other process, and another to wrap that to make it easy to check
whether the lock held on a particular table is blocking another
process. It uses this new capability to be smarter about scheduling
autovacuum's truncation work, and to avoid throwing away
incremental progress in doing so.
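
To make the idea concrete, here is a minimal, self-contained sketch of
the scheduling logic, not the patch's code. It assumes a simplified
model: the table is an array of "block is empty" flags, time is
simulated by a fixed cost per block, and the "is my lock blocking
anyone?" check is a callback. All names here (ScanResult,
scan_truncatable_tail, no_waiters, waiters_on_nth_check) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOCK_CHECK_INTERVAL_MS 20   /* same value as the check interval above */

/* Hypothetical stand-in for "is the lock I hold blocking another process?" */
typedef bool (*waiter_check_fn) (void *arg);

typedef struct ScanResult
{
    int  new_nblocks;   /* table can safely be truncated to this length */
    bool interrupted;   /* true if we stopped because someone was waiting */
} ScanResult;

/*
 * Scan backwards from the end of the table over empty blocks.  Every
 * LOCK_CHECK_INTERVAL_MS of simulated work, ask whether our lock is
 * blocking anyone; if so, stop, but report the blocks already verified
 * as empty so that progress is kept rather than thrown away.
 */
ScanResult
scan_truncatable_tail(const bool *block_is_empty, int nblocks,
                      int ms_per_block,
                      waiter_check_fn has_waiters, void *arg)
{
    ScanResult res = {nblocks, false};
    int        elapsed_ms = 0;
    int        blkno = nblocks;

    assert(ms_per_block > 0);

    while (blkno > 0)
    {
        if (elapsed_ms >= LOCK_CHECK_INTERVAL_MS)
        {
            elapsed_ms = 0;
            if (has_waiters(arg))
            {
                res.interrupted = true;
                break;
            }
        }
        if (!block_is_empty[blkno - 1])
            break;              /* first non-empty block ends the scan */
        blkno--;
        elapsed_ms += ms_per_block;
    }
    res.new_nblocks = blkno;
    return res;
}

/* Demo callbacks: nobody ever waits, or a waiter shows up on the Nth check. */
bool
no_waiters(void *arg)
{
    (void) arg;
    return false;
}

bool
waiters_on_nth_check(void *arg)
{
    int *remaining = arg;

    return --(*remaining) <= 0;
}
```

In this model, an interrupted scan still returns a shorter table length,
so the caller can truncate what was verified, release the lock, and come
back later for the rest.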

As such, I don't think it would be crazy to back-patch this, but I
think it would be wise to allow it to be proven on master/9.3 for a
while before considering that.

-Kevin
