Re: Cpu usage 100% on slave. s_lock problem.

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Дмитрий Дегтярёв <degtyaryov(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Cpu usage 100% on slave. s_lock problem.
Date: 2013-09-26 23:08:11
Message-ID: 20130926230811.GA29658@awork2.anarazel.de
Lists: pgsql-performance

On 2013-08-27 12:17:55 -0500, Merlin Moncure wrote:
> On Tue, Aug 27, 2013 at 10:55 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2013-08-27 09:57:38 -0500, Merlin Moncure wrote:
> >> + bool
> >> + RecoveryMightBeInProgress(void)
> >> + {
> >> +     /*
> >> +      * We check shared state each time only until we leave recovery mode. We
> >> +      * can't re-enter recovery, so there's no need to keep checking after the
> >> +      * shared variable has once been seen false.
> >> +      */
> >> +     if (!LocalRecoveryInProgress)
> >> +         return false;
> >> +     else
> >> +     {
> >> +         /* use volatile pointer to prevent code rearrangement */
> >> +         volatile XLogCtlData *xlogctl = XLogCtl;
> >> +
> >> +         /* Intentionally query xlogctl without spinlocking! */
> >> +         LocalRecoveryInProgress = xlogctl->SharedRecoveryInProgress;
> >> +
> >> +         return LocalRecoveryInProgress;
> >> +     }
> >> + }
> >
> > I don't think it's acceptable to *set* LocalRecoveryInProgress
> > here. That should only be done in the normal routine.
>
> Quite right -- that was a major error -- with some bad luck you could
> bypass the initialization call to the xlog.

I've seen this in profiles since, so I'd appreciate pushing this
forward.
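
To make the objection above concrete, here is a minimal standalone sketch
of a read-only variant of the check. It is not the committed patch and it
does not use the real PostgreSQL headers: the XLogCtlData struct is cut
down to one field, and XLogCtlStorage, RecoveryInProgressSketch() and
xlog_init_calls are hypothetical stand-ins for shared memory, the normal
routine, and its one-time xlog initialization. The point is only that the
lock-free check reads the shared flag without writing
LocalRecoveryInProgress, so the normal routine still gets to run its
initialization exactly once.

```c
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for the shared-memory control struct. */
typedef struct XLogCtlData
{
    bool        SharedRecoveryInProgress;
} XLogCtlData;

/* Stand-in for shared memory; a real server maps this at startup. */
static XLogCtlData XLogCtlStorage = {true};
static XLogCtlData *XLogCtl = &XLogCtlStorage;

/* Backend-local cache. Only the normal routine is allowed to clear it. */
static bool LocalRecoveryInProgress = true;

/* Counts how often the (hypothetical) one-time xlog setup has run. */
static int  xlog_init_calls = 0;

/*
 * Cheap hint: reads the shared flag without a spinlock and, unlike the
 * quoted patch, never writes LocalRecoveryInProgress.
 */
bool
RecoveryMightBeInProgress(void)
{
    if (!LocalRecoveryInProgress)
        return false;
    else
    {
        /* use volatile pointer to prevent code rearrangement */
        volatile XLogCtlData *xlogctl = XLogCtl;

        /* Read only; the local cache is deliberately left untouched. */
        return xlogctl->SharedRecoveryInProgress;
    }
}

/*
 * Stand-in for the normal routine: the only place that updates the local
 * cache and performs the one-time setup when recovery is seen to end.
 * (The spinlock the real routine takes is omitted in this sketch.)
 */
bool
RecoveryInProgressSketch(void)
{
    if (!LocalRecoveryInProgress)
        return false;

    LocalRecoveryInProgress = XLogCtl->SharedRecoveryInProgress;

    if (!LocalRecoveryInProgress)
        xlog_init_calls++;      /* placeholder for xlog initialization */

    return LocalRecoveryInProgress;
}
```

With this split, a caller that sees RecoveryMightBeInProgress() return
false after the shared flag flips still leaves LocalRecoveryInProgress set,
so the next call to the normal routine performs the setup instead of
returning early.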

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
