Re: autovacuum truncate exclusive lock round two

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum truncate exclusive lock round two
Date: 2012-12-04 16:55:48
Message-ID: 50BE2B14.400@Yahoo.com
Lists: pgsql-hackers

On 12/4/2012 8:06 AM, Kevin Grittner wrote:
> Jan Wieck wrote:
>> I believe the check interval needs to be decoupled from the
>> deadlock_timeout again.
>
> OK
>
>> This will leave us with 2 GUCs at least.
>
> Hmm. What problems do you see with hard-coding reasonable values?

The question is what is reasonable?

Let's talk about the time to (re)acquire the lock first. In the cases
where truncating a table can hurt, we are dealing with many gigabytes.
The original vacuumlazy scan of them can take hours if not days. During
that scan the vacuum worker has probably spent many hours napping in the
vacuum delay points. For me, retrying at a 50ms interval for up to
5 seconds would be reasonable for (re)acquiring that lock.

The reasoning behind it is that we need some sort of retry mechanism
because if autovacuum just gave up the exclusive lock because someone
needed access, it is more or less guaranteed that the immediate attempt
to reacquire it will fail until that waiter has committed. But if it
can't get a lock after 5 seconds, the system seems busy enough so that
autovacuum should come back much later, when the launcher kicks it off
again.
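
A minimal sketch of such a retry loop, to make the 50ms/5s numbers
concrete. This is not code from the patch: retry_lock(), try_lock_fn,
sleep_ms_fn and free_after_n() are hypothetical names, and the callbacks
merely stand in for a non-blocking lock attempt and a sleep call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types, not PostgreSQL APIs. */
typedef bool (*try_lock_fn) (void *ctx);
typedef void (*sleep_ms_fn) (int ms);

/*
 * Retry a non-blocking lock attempt every interval_ms until it succeeds
 * or timeout_ms has been spent waiting.  Returns the number of attempts
 * made on success, or -1 if we gave up.
 */
int
retry_lock(try_lock_fn try_lock, void *ctx, sleep_ms_fn sleep_ms,
           int interval_ms, int timeout_ms)
{
    int         waited_ms = 0;
    int         attempts = 0;

    assert(interval_ms > 0 && timeout_ms >= 0);

    for (;;)
    {
        attempts++;
        if (try_lock(ctx))
            return attempts;
        if (waited_ms >= timeout_ms)
            return -1;          /* system looks busy; come back much later */
        if (sleep_ms != NULL)
            sleep_ms(interval_ms);
        waited_ms += interval_ms;
    }
}

/* Stand-in lock attempt: fails *ctx times, then succeeds. */
bool
free_after_n(void *ctx)
{
    int        *remaining = (int *) ctx;

    if (*remaining > 0)
    {
        (*remaining)--;
        return false;
    }
    return true;
}
```

With interval_ms = 50 and timeout_ms = 5000, a waiter that goes away
after a few intervals lets the loop succeed quickly, while a lock that
stays contended makes it give up after roughly 5 seconds of waiting.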

I don't care much about occupying that autovacuum worker for a few
seconds. It just spent hours vacuuming that very table. How much harm
will a couple more seconds do?

The check interval for the LockHasWaiters() call, however, depends very
much on the response time constraints of the application. A 200ms
interval for example would cause the truncate phase to hold onto the
exclusive lock for 200ms at least. That means that a steady stream of
short running transactions would see a 100ms "blocking" on average,
200ms max. For many applications that is probably OK. If your response
time constraint is <=50ms on 98% of transactions, you might want to have
that knob though.
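
The 100ms average / 200ms max figures can be checked with a small model.
This is only a sketch under simplifying assumptions: waiters arrive
evenly spread across one check interval, and the lock is released the
instant the next check notices them. blocked_ms() and
average_blocked_ms() are hypothetical helper names.

```c
#include <assert.h>

/*
 * Time a waiter stays blocked if it arrives arrival_ms after the last
 * check and the lock holder only looks for waiters every
 * check_interval_ms.  Assumes the lock is released right at the check.
 */
double
blocked_ms(double arrival_ms, double check_interval_ms)
{
    assert(check_interval_ms > 0);
    assert(arrival_ms >= 0 && arrival_ms <= check_interval_ms);
    return check_interval_ms - arrival_ms;
}

/* Average over n_samples arrivals spread evenly across one interval. */
double
average_blocked_ms(double check_interval_ms, int n_samples)
{
    double      sum = 0.0;
    int         i;

    assert(n_samples > 0);
    for (i = 0; i < n_samples; i++)
    {
        /* midpoint of the i-th slice of the interval */
        double      arrival = (i + 0.5) * check_interval_ms / n_samples;

        sum += blocked_ms(arrival, check_interval_ms);
    }
    return sum / n_samples;
}
```

In this model the worst case equals the check interval and the average
is half of it, so a 50ms interval would bring the average down to about
25ms.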

I admit I really have no idea what the most reasonable default for that
value would be. Something between 50ms and deadlock_timeout/2 I guess.

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
