Re: Cpu usage 100% on slave. s_lock problem.

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Дмитрий Дегтярёв <degtyaryov(at)gmail(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Cpu usage 100% on slave. s_lock problem.
Date: 2013-08-27 14:12:57
Message-ID: CAHyXU0wavt2APZ89C7x9b3y-p-2trkdk6UpyQotiPXgPVKd_UQ@mail.gmail.com
Lists: pgsql-performance

On Tue, Aug 27, 2013 at 8:38 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Tue, Aug 27, 2013 at 8:23 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> It looks like you're hitting spinlock contention inside
>> heap_page_prune_opt(), which is commented:
>> * Note: this is called quite often. It's important that it fall out quickly
>> * if there's not any use in pruning.
>>
>> This in turn calls RecoveryInProgress() which spinlocks in order to
>> get a guaranteed result. At that call site, we are told:
>> /*
>> * We can't write WAL in recovery mode, so there's no point trying to
>> * clean the page. The master will likely issue a cleaning WAL record soon
>> * anyway, so this is no particular loss.
>> */
>>
>> So ISTM it's not necessary to pedantically check RecoveryInProgress() on
>> each and every call of this routine (or at least, we should be able to
>> reduce the number of spinlock acquisitions).
>>
>> Hm, what if we exposed LocalRecoveryInProgress() through a function
>> which would approximately satisfy the condition
>> "MightRecoveryInProgress()", on the basis that the condition only moves
>> in one direction? That could lead to optimization around the spinlock in
>> hot path cases like this, where getting 'TRUE' incorrectly is mostly
>> harmless...
>
> More specifically, this hypothetical routine would query
> xlogctl->SharedRecoveryInProgress without taking a lock and would not
> issue InitXLOGAccess(). RecoveryInProgress() seems to be called
> everywhere (in particular, in StartTransaction()), so I don't think
> there's a lot of risk in terms of losing access to the xlog.

Something like the attached. Note, this patch is for research
purposes only and should *not* be applied to your production
environment.
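
To make the idea concrete outside the PostgreSQL tree, here is a minimal
standalone sketch in C11. It is not the attached patch: SharedCtl,
recovery_in_progress_locked(), might_recovery_in_progress() and
end_recovery() are hypothetical names, the toy spinlock stands in for
s_lock, and the use of C11 atomics is an assumption of the sketch rather
than what xlog.c does. It only illustrates why an unlocked read is
tolerable when the flag moves in one direction (true -> false): a stale
read can only report 'true' a little too long.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared-memory control struct. */
typedef struct {
    atomic_flag lock;         /* toy spinlock, stands in for s_lock */
    atomic_bool in_recovery;  /* only ever changes true -> false */
} SharedCtl;

static SharedCtl ctl = { .lock = ATOMIC_FLAG_INIT, .in_recovery = true };

/* Per-process cache: once we have seen 'false', we never ask again. */
static bool local_in_recovery = true;

static void spin_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&ctl.lock, memory_order_acquire))
        ;  /* busy-wait; this is the cost we want off the hot path */
}

static void spin_release(void)
{
    atomic_flag_clear_explicit(&ctl.lock, memory_order_release);
}

/* Exact answer: takes the spinlock on every call while in recovery. */
bool recovery_in_progress_locked(void)
{
    if (!local_in_recovery)
        return false;

    spin_acquire();
    bool result = atomic_load_explicit(&ctl.in_recovery, memory_order_relaxed);
    spin_release();

    if (!result)
        local_in_recovery = false;  /* remember the one-way transition */
    return result;
}

/*
 * Cheap hint: reads the shared flag without the lock and does not touch
 * the local cache. May briefly return a stale 'true' after recovery has
 * ended, which is acceptable only where a false positive is harmless.
 */
bool might_recovery_in_progress(void)
{
    if (!local_in_recovery)
        return false;
    return atomic_load_explicit(&ctl.in_recovery, memory_order_relaxed);
}

/* Called once by whoever finishes recovery. */
void end_recovery(void)
{
    spin_acquire();
    atomic_store_explicit(&ctl.in_recovery, false, memory_order_relaxed);
    spin_release();

    /* Nothing ever sets the flag back to true, so this always holds. */
    assert(!atomic_load(&ctl.in_recovery));
}
```

A hot-path caller that merely skips optional work during recovery could
use might_recovery_in_progress(); anything that needs the exact answer
would still go through the locked version.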

merlin

Attachment Content-Type Size
recovery.patch application/octet-stream 3.7 KB
