Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
Date: 2014-02-14 15:51:18
Message-ID: 20140214155118.GC20375@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-02-14 15:03:16 +0100, Florian Pflug wrote:
> On Feb14, 2014, at 14:07 , Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-02-14 13:52:45 +0100, Florian Pflug wrote:
> >>> I agree we should do that, but imo not in the backbranches. Anything
> >>> more than than the minimal fix in that code should be avoided in the
> >>> stable branches, this stuff is friggin performance sensitive, and the
> >>> spinlock already is a *major* contention point in many workloads.
> >>
> >> No argument there. But unless I missed something, there actually is a bug
> >> there, and using volatile won't fix it. A barrier would, but what about the
> >> back branches that don't have barrier.h?
> >
> > Yea, but I don't see a better alternative. Realistically the likelihood
> > of a problem without the compiler reordering stuff is miniscule on any
> > relevant platform that's supported by the !barrier.h branches. Newer
> > ARMs are the only realistic suspect, and the support for in older
> > releases wasn't so good...
>
> Isn't POWER/PowerPC potentially affected by this?

I wouldn't consider it a major architecture... And I am not sure how
much out of order a CPU has to be to be affected by this, the read side
uses spinlocks, which in most of the spinlock implementations implies a
full memory barrier which in many cache coherency designs will cause
other CPUs to flush writes. And I think the control dependency in the
PGSemaphoreUnlock() loop will actually cause a flush on ppc...

> Also, there is still the alternative of not resetting lwWaitLink in LWLockRelease.
> I can understand why you oppose that - it's certainly nicer to not have stray
> pointer lying around. But since (as least as far as we know)
>
> a) resetting lwWaitLink is not strictly necessary

I don't want to rely on that in the backbranches, that's an entirely new
assumption. I think anything more than minimal changes are hard to
justify for a theoretical issue that hasn't been observed in the field.

I think the biggest danger here is that writes are reordered by the
compiler, that we definitely need to protect against. Making a variable
volatile or introducing a memory barrier is pretty simple to analyze.

> b) resetting lwWaitLink introduces a race condition (however unlikely)
>
> we'll trade correctness for cleanliness if we continue to reset lwWaitLink
> without protecting against the race. That's a bit of a weird trade-off to make.

It's not just cleanliness, it's being able to actually debug crashes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2014-02-14 16:03:19 Re: HBA files w/include support?
Previous Message Florian Pflug 2014-02-14 15:47:51 Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease