Re: autovacuum truncate exclusive lock round two

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>,"Amit Kapila" <amit(dot)kapila(at)huawei(dot)com>, "Stephen Frost" <sfrost(at)snowman(dot)net>, "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum truncate exclusive lock round two
Date: 2012-12-03 22:42:04
Message-ID: 20121203224204.69310@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jan Wieck wrote:

> Attached is a new patch that addresses most of the points raised
> in discussion before.
>
> 1) Most of the configuration variables are derived from
> deadlock_timeout now. The "check for conflicting lock request"
> interval is deadlock_timeout/10, clamped to 10ms. The "try to
> acquire exclusive lock" interval is deadlock_timeout/20, also
> clamped to 10ms. The only GUC variable remaining is
> autovacuum_truncate_lock_try=2000ms with a range from 0 (just try
> once) to 20000ms.

If we're going to keep this GUC, we need docs for it.

> I'd like to point out that this is a significant change in
> functionality as without the config option for the check
> interval, there is no longer any possibility to disable the call
> to LockHasWaiters() and return to the original (deadlock code
> kills autovacuum) behavior.

Arguably we could simplify the deadlock resolution code a little,
but it seems like it is probably safer to leave it as a failsafe,
at least for now.

> I did not touch the part about suppressing the stats and the
> ANALYZE step of "auto vacuum+analyze". The situation is no
> different today. When the deadlock code kills autovacuum, stats
> aren't updated either. And this patch is meant to cause
> autovacuum to finish the truncate in a few minutes or hours, so
> that the situation fixes itself.

Unless someone want to argue for more aggressive updating of the
stats, I guess I'll yield to that argument.

Besides the documentation, the only thing I could spot on this
go-around was that this comment probably no longer has a reason for
being since this is no longer conditional and what it's doing is
obvious from the code:

   /* Initialize the starttime if we check for conflicting lock requests */

With docs and dropping the obsolete comment, I think this will be
ready.

-Kevin


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum truncate exclusive lock round two
Date: 2012-12-03 23:08:22
Message-ID: 50BD30E6.3030708@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/3/2012 5:42 PM, Kevin Grittner wrote:
> Jan Wieck wrote:
>
>> Attached is a new patch that addresses most of the points raised
>> in discussion before.
>>
>> 1) Most of the configuration variables are derived from
>> deadlock_timeout now. The "check for conflicting lock request"
>> interval is deadlock_timeout/10, clamped to 10ms. The "try to
>> acquire exclusive lock" interval is deadlock_timeout/20, also
>> clamped to 10ms. The only GUC variable remaining is
>> autovacuum_truncate_lock_try=2000ms with a range from 0 (just try
>> once) to 20000ms.
>
> If we're going to keep this GUC, we need docs for it.

Certainly. But since we're still debating which and how many GUC
variables we want, I don't think doc-time has come yet.

>
>> I'd like to point out that this is a significant change in
>> functionality as without the config option for the check
>> interval, there is no longer any possibility to disable the call
>> to LockHasWaiters() and return to the original (deadlock code
>> kills autovacuum) behavior.
>
> Arguably we could simplify the deadlock resolution code a little,
> but it seems like it is probably safer to leave it as a failsafe,
> at least for now.

Thinking about it, I'm not really happy with removing the
autovacuum_truncate_lock_check GUC at all.

Fact is that the deadlock detection code and the configuration parameter
for it should IMHO have nothing to do with all this in the first place.
A properly implemented application does not deadlock. Someone running
such a properly implemented application should be able to safely set
deadlock_timeout to minutes without the slightest ill side effect, but
with the benefit that the deadlock detection code itself does not add to
the lock contention. The only reason one cannot do so today is because
autovacuum's truncate phase could then freeze the application with an
exclusive lock for that long.

I believe the check interval needs to be decoupled from the
deadlock_timeout again. This will leave us with 2 GUCs at least.

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin