From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> |
Subject: | Re: Wait free LW_SHARED acquisition - v0.9 |
Date: | 2014-10-08 19:58:54 |
Message-ID: | 20141008195854.GA23743@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2014-10-08 15:23:22 -0400, Robert Haas wrote:
> On Wed, Oct 8, 2014 at 9:35 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > 1) Convert PGPROC->lwWaitLink into a dlist. The old code was frail and
> > verbose. This also does:
> > * changes the logic in LWLockRelease() to release all shared lockers
> > when waking up any. This can yield some significant performance
> > improvements - and the fairness isn't really much worse than
> > before,
> > as we always allowed new shared lockers to jump the queue.
> >
> > * adds a memory pg_write_barrier() in the wakeup paths between
> > dequeuing and unsetting ->lwWaiting. That was always required on
> > weakly ordered machines, but f4077cda2 made it more urgent. I can
> > reproduce crashes without it.
>
> I think it's a really bad idea to mix a refactoring change (like
> converting PGPROC->lwWaitLink into a dlist) with an attempted
> performance enhancement (like changing the rules for jumping the lock
> queue) and a bug fix (like adding pg_write_barrier where needed). I'd
> suggest that the last of those be done first, and perhaps
> back-patched.
I think it makes sense to separate out the write barrier one. I don't
really see the point of separating the other two changes.
I've indeed previously started a thread
(http://archives.postgresql.org/message-id/20140210134625.GA15246%40awork2.anarazel.de)
about the barrier issue. IIRC you argued that that might be to
expensive.
> The current coding, using a hand-rolled list, touches shared memory
> fewer times. When many waiters are awoken at once, we clip them all
> out of the list at one go. Your revision moves them to a
> backend-private list one at a time, and then pops them off one at a
> time. The backend-private memory accesses don't seem like they matter
> much, but the shared memory accesses would be nice to avoid.
I can't imagine this to matter. We're entering the kernel for each PROC
for the PGSemaphoreUnlock() and we're dirtying the cacheline for
proc->lwWaiting = false anyway. This really is the slow path.
> Does LWLockUpdateVar's wake-up loop need a write barrier per
> iteration, or just one before the loop starts? How about commenting
> the pg_write_barrier() with the read-fence to which it pairs?
Hm. Are you picking out LWLockUpdateVar for a reason or just as an
example? Because I don't see a difference between the different wakeup
loops?
It needs to be a barrier per iteration.
Currently the loop looks like
while (head != NULL)
{
proc = head;
head = proc->lwWaitLink;
proc->lwWaitLink = NULL;
proc->lwWaiting = false;
PGSemaphoreUnlock(&proc->sem);
}
Consider what happens if either the compiler or the cpu reorders this
to:
proc->lwWaiting = false;
head = proc->lwWaitLink;
proc->lwWaitLink = NULL;
PGSemaphoreUnlock(&proc->sem);
as soon as lwWaiting = false, 'proc' can wake up and acquire a new
lock. Backends can wake up prematurely because proc->sem is used for
other purposes than this (that's why the loops around PGSemaphoreLock
exist). Then it could reset lwWaitLink while acquiring a new lock. And
some processes wouldn't be woken up anymore.
The barrier it pairs with is the spinlock acquiration before
requeuing. To be more obviously correct we could add a read barrier
before
if (!proc->lwWaiting)
break;
but I don't think it's needed.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2014-10-08 20:01:53 | Re: Wait free LW_SHARED acquisition - v0.9 |
Previous Message | Andrew Dunstan | 2014-10-08 19:39:37 | Re: Context lenses to set/get values in json values. |