From: Michael Tautschnig <mt(at)debian(dot)org>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: Jade Alglave <jade(dot)alglave(at)cs(dot)ox(dot)ac(dot)uk>, Vincent Nimal <vincent(dot)nimal(at)balliol(dot)ox(dot)ac(dot)uk>, Daniel Kroening <kroening(at)cs(dot)ox(dot)ac(dot)uk>
Subject: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Date: 2012-02-29 15:18:54
Message-ID: 20120229151854.GJ92164@l04.local
Lists: pgsql-hackers

Hi all,

[Bcc'ed Tom Lane as he had done the initial investigation on this.]

Following up on the earlier discussions in

[1] http://archives.postgresql.org/pgsql-hackers/2010-11/msg01575.php
and
[2] http://archives.postgresql.org/pgsql-hackers/2011-08/msg00330.php
with an initial fix in
[3] http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4e15a4db5e65e43271f8d20750d6500ab12632d0

we (a group of researchers from Oxford and London) did a more formal analysis
using a chain of automated verification tools. This tool chain enables us to
automatically check for the presence or absence of bugs in (smaller) pieces of
software when run under weak-memory semantics, as is the case on x86 or
PowerPC. [In this context we are always interested in further source-code
examples that make significant use of shared-memory concurrency on such
platforms.]

Using our tool chain, we were eager to (1) confirm the bug and (2) validate the
proposed fix. See the very end for our analysis of the proposed fix.

The example that Tom Lane initially posted in [2] can be further simplified to
the following bit of self-contained C code (with line numbers), where WaitLatch
is a simple busy wait, "while(!latch[.]);", and ResetLatch is just
"latch[.]=0;". Running two of these worker functions in parallel then suffices
to reproduce the problem that was initially reported (a sketch of one possible
harness follows the listing).

 1  #define WORKERS 2
 2  volatile _Bool latch[WORKERS];
 3  volatile _Bool flag[WORKERS];
 4
 5  void worker(int i)
 6  {
 7    while(!latch[i]);
 8    for(;;)
 9    {
10      assert(!latch[i] || flag[i]);
11      latch[i] = 0;
12      if(flag[i])
13      {
14        flag[i] = 0;
15        flag[(i+1) % WORKERS] = 1;
16        latch[(i+1) % WORKERS] = 1;
17      }
18
19      while(!latch[i]);
20    }
21  }
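
For concreteness, the following is a minimal sketch of the kind of harness we
mean by "running two of these worker functions in parallel". The pthread
wrapper and the initialisation of flag[0]/latch[0] are our own assumptions and
not part of the original report; the listing above additionally needs
#include <assert.h> when compiled:

#include <pthread.h>

extern volatile _Bool latch[], flag[];
void worker(int i);

static void *run(void *arg)
{
    worker((int)(long) arg);
    return NULL;               /* never reached: worker() loops forever */
}

int main(void)
{
    pthread_t t0, t1;

    flag[0] = 1;               /* hand the token to worker 0 ...          */
    latch[0] = 1;              /* ... and wake it up first (our choice of
                                  initialisation, an assumption)          */
    pthread_create(&t0, NULL, run, (void *) 0L);
    pthread_create(&t1, NULL, run, (void *) 1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return 0;
}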

We developed a tool that analyses a piece of concurrent C code and determines
whether it contains bugs. Our tool confirms the presence of at least two bugs
in the piece of code under discussion.

The first problem corresponds to a message-passing idiom (view with a
fixed-width font; the worker:line prefix in front of each statement gives the
line number that worker/thread is executing):

Worker 0 | Worker 1
===================+========================
(0:15) flag[1]=1; | (1:7) while (!latch[1]);
(0:16) latch[1]=1; | (1:12) if (flag[1])

where we can observe latch[1] holding 1 and flag[1] holding 0 in the end.

This behaviour can happen on the PowerPC architecture for three reasons. First,
write-write pairs can be reordered, for example the write to flag and the write
to latch by Worker 0. Second, read-read pairs can be reordered, for example the
read of the while condition and the read of the if condition by Worker 1.
Finally, the store to latch by Worker 0 might not be atomic.

This corresponds to a scenario where we first execute Worker 0 up to line 17.
All the reads and writes go directly to memory, except for the assignments at
lines 14 and 15, where the values 0 and 1 are stored into processor-local
buffers for flag[0] and flag[1], respectively. Then the second worker starts,
reads the freshly updated value 1 of latch[1], and exits the blocking while
loop (line 7). But latch[1] still holds 1, and flag[1] still holds 0, as
Worker 0 has not yet flushed the write of 1 waiting in its buffer. This means
that the condition of the if statement is false, the critical section is
skipped, and the program arrives at line 19 without having authorised the next
worker to enter its critical section, and thus loops forever.

This seems to correspond to the scenario discussed in [2] above, where the wait
in line 19 times out. It is corroborated by the fact that this reordering is
commonly observed on several generations of Power machines (e.g. 1.7G/167G on
Power 7).
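
To make the two sides of this idiom and the barrier placement concrete, here is
a small self-contained sketch (our own illustration, not PostgreSQL code). The
lwsync inline assembly is only meaningful when compiling for Power, and a
single run of such a program is of course not a reliable way to observe the
reordering; our tools and looped litmus tests serve that purpose. With both
lwsyncs in place the printf branch becomes unreachable; removing either one
re-enables the outcome described above:

#include <pthread.h>
#include <stdio.h>

static volatile int flag1 = 0;   /* plays the role of flag[1]  */
static volatile int latch1 = 0;  /* plays the role of latch[1] */

#define LWSYNC() __asm__ __volatile__("lwsync" ::: "memory")

static void *worker0(void *arg)  /* lines 15-16 of Worker 0 */
{
    flag1 = 1;
    LWSYNC();                    /* keeps the two stores in order */
    latch1 = 1;
    return NULL;
}

static void *worker1(void *arg)  /* lines 7/19 and 12 of Worker 1 */
{
    while (!latch1)
        ;                        /* busy wait standing in for WaitLatch */
    LWSYNC();                    /* keeps the two loads in order */
    if (!flag1)
        printf("mp reordering observed: latch set, flag still clear\n");
    return NULL;
}

int main(void)
{
    pthread_t t0, t1;

    pthread_create(&t1, NULL, worker1, NULL);
    pthread_create(&t0, NULL, worker0, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return 0;
}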

Let us now focus on the second bug; it corresponds to a load-buffering idiom:

Worker 0 | Worker 1
=====================+========================
(0:12) if (flag[0]) | (1:12) if (flag[1])
(0:15) flag[1]=1; | (1:15) flag[0]=1;

where we can observe both flag[0] and flag[1] holding 1 in the end.

This behaviour is permitted by the PowerPC architecture for two reasons. First,
PowerPC can reorder read-write pairs, hence the read of the if condition on
either thread can be performed after that thread's write to the flag. Second,
the fact that PowerPC relaxes the atomicity of stores is another reason for
this outcome to arise. Our understanding is that this behaviour is not yet
exhibited by real-world Power machines. Yet, since it is documented as
permitted by the architecture, it could lead to actual bugs on future
generations of Power machines if not synchronised correctly.
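
For illustration only, the bare load-buffering shape (with the surrounding
latch machinery and the if stripped away) looks as follows; this sketch merely
makes the shape concrete, since a single run will not exhibit the outcome and,
as said, it has not been observed on current hardware:

#include <pthread.h>
#include <stdio.h>

static volatile int flag0 = 0, flag1 = 0;
static int r0, r1;

static void *thread0(void *arg) { r0 = flag0; flag1 = 1; return NULL; } /* read, then write */
static void *thread1(void *arg) { r1 = flag1; flag0 = 1; return NULL; } /* read, then write */

int main(void)
{
    pthread_t t0, t1;

    pthread_create(&t0, NULL, thread0, NULL);
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    /* r0 == 1 && r1 == 1 is the lb outcome: impossible under sequential
       consistency, but documented as permitted by the Power architecture. */
    printf("r0=%d r1=%d\n", r0, r1);
    return 0;
}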

In [3] it was suggested to fix the problem by placing a barrier in ResetLatch,
which corresponds to placing it between lines 11 and 12 in the code above. This
amounts to placing a barrier between the two reads (lines 7/19 and 12, i.e.,
between WaitLatch and the if(flag[1]) ) of Worker 1.

Placing a sync (i.e., the strongest Power barrier) accordingly would, however,
still be insufficient for the second problem, as it would only fix the
reordering of read-read pairs by Worker 1 and the store atomicity issue from
Worker 0. But the writes on Worker 0 could still be reordered (problem number
2). One possible fix consists of placing a sync between the two writes on Worker
0, and an address dependency between the two reads on Worker 1. Clearly,
however, these are changes that cannot any longer be hidden behind the
ResetLatch/WaitLatch interface, but rather go in the code using these.

Please let us know if we got into too much detail here and if you have any
questions, or would like further discussion/validation of potential fixes!

Best,
Michael

(on behalf of the team: Jade Alglave, Daniel Kroening, Vincent Nimal, Michael
Tautschnig, as CC'ed)


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Tautschnig <mt(at)debian(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org, Jade Alglave <jade(dot)alglave(at)cs(dot)ox(dot)ac(dot)uk>, Vincent Nimal <vincent(dot)nimal(at)balliol(dot)ox(dot)ac(dot)uk>, Daniel Kroening <kroening(at)cs(dot)ox(dot)ac(dot)uk>
Subject: Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Date: 2012-03-21 16:15:17
Message-ID: CA+TgmobHOisrdUk5EtqAt4ogihfOtGnso1Ekozazvterbq9vDA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 29, 2012 at 10:18 AM, Michael Tautschnig <mt(at)debian(dot)org> wrote:
> In [3] it was suggested to fix the problem by placing a barrier in ResetLatch,
> which corresponds to placing it between lines 11 and 12 in the code above.  This
> amounts to placing a barrier between the two reads (lines 7/19 and 12, i.e.,
> between WaitLatch and the if(flag[1]) ) of Worker 1.
>
> Placing a sync (i.e., the strongest Power barrier) accordingly would, however,
> still be insufficient for the second problem, as it would only fix the
> reordering of read-read pairs by Worker 1 and the store atomicity issue from
> Worker 0. But the writes on Worker 0 could still be reordered (problem number
> 2). One possible fix consists of placing a sync between the two writes on Worker
> 0, and an address dependency between the two reads on Worker 1. Clearly,
> however, these are changes that cannot any longer be hidden behind the
> ResetLatch/WaitLatch interface, but rather go in the code using these.

Well, part of my skepticism about Tom's proposal to include memory
barrier instructions in the latch primitives was the fear that
something like what you're suggesting here might be true: namely, that
it might create the illusion of safety for people using the
primitives, when in reality thought, and possibly additional barrier
instructions, might still be needed.

However, your example is enough unlike the actual code that the
conclusion you state following the word "clearly" isn't actually clear
to me. According to latch.h, the correct method of using a latch is
like this:

* for (;;)
* {
* ResetLatch();
* if (work to do)
* Do Stuff();
* WaitLatch();
* }

Meanwhile, anyone who is creating additional work to do should add the
work to the queue and then set the latch.

So it seems to me that we could potentially fix this by inserting
barriers at the end of ResetLatch and at the beginning of SetLatch and
WaitLatch. Then the latch has to get reset before we check whether
there's work to do; and we've got to finish checking for work before
we again try to wait for the latch. Similarly, any work that was in
progress before SetLatch was called will be forced to be committed to
memory before SetLatch does anything else. Adding that many barriers
might not be very good for performance but it seems OK from a
correctness point of view, unless I am missing something, which is
definitely possible. I'd appreciate any thoughts you have on this, as
this is clearly subtle and tricky to get exactly right.
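
On the simplified busy-wait versions of the primitives from your example, the
placement I have in mind would look roughly like this (just a sketch, using
GCC's __sync_synchronize() builtin as a stand-in for whatever full-barrier
primitive we end up using; the real latch functions are of course more
complicated than this):

#define WORKERS 2
volatile _Bool latch[WORKERS];

#define memory_barrier() __sync_synchronize()   /* GCC builtin, full barrier */

void ResetLatch(int i)
{
    latch[i] = 0;
    memory_barrier();       /* latch is reset before we check for work */
}

void SetLatch(int i)
{
    memory_barrier();       /* queued work is visible before the latch is set */
    latch[i] = 1;
}

void WaitLatch(int i)
{
    memory_barrier();       /* checking for work finishes before we wait */
    while (!latch[i])
        ;                   /* busy wait standing in for the real sleep */
}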

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Michael Tautschnig <mt(at)debian(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Jade Alglave <jade(dot)alglave(at)cs(dot)ox(dot)ac(dot)uk>, Vincent Nimal <vincent(dot)nimal(at)balliol(dot)ox(dot)ac(dot)uk>, Daniel Kroening <kroening(at)cs(dot)ox(dot)ac(dot)uk>
Subject: Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Date: 2012-03-21 18:09:35
Message-ID: 20120321180934.GR1837@l04.local
Lists: pgsql-hackers

Hi,

[...]
> > Placing a sync (i.e., the strongest Power barrier) accordingly would, however,
> > still be insufficient for the second problem, as it would only fix the
> > reordering of read-read pairs by Worker 1 and the store atomicity issue from
> > Worker 0. But the writes on Worker 0 could still be reordered (problem number
> > 2). One possible fix consists of placing a sync between the two writes on Worker
> > 0, and an address dependency between the two reads on Worker 1. Clearly,
> > however, these are changes that cannot any longer be hidden behind the
> > ResetLatch/WaitLatch interface, but rather go in the code using these.
>
[...]
> However, your example is enough unlike the actual code that the
> conclusion you state following the word "clearly" isn't actually clear
> to me. According to latch.h, the correct method of using a latch is
> like this:
>
> * for (;;)
> * {
> * ResetLatch();
> * if (work to do)
> * Do Stuff();
> * WaitLatch();
> * }
>
> Meanwhile, anyone who is creating additional work to do should add the
> work to the queue and then set the latch.
>

When writing the above statement, including the "clearly", we were probably
thinking too much of the above usage hint, which uses only ResetLatch and
WaitLatch. As you say, ...

> So it seems to me that we could potentially fix this by inserting
> barriers at the end of ResetLatch and at the beginning of SetLatch and
> WaitLatch. Then the latch has to get reset before we check whether
> there's work to do; and we've got to finish checking for work before
> we again try to wait for the latch. Similarly, any work that was in
> progress before SetLatch was called will be forced to be committed to
> memory before SetLatch does anything else. Adding that many barriers
> might not be very good for performance but it seems OK from a
> correctness point of view, unless I am missing something, which is
> definitely possible. I'd appreciate any thoughts you have on this, as
> this is clearly subtle and tricky to get exactly right.
>

... placing another barrier in "SetLatch" could indeed do the trick. We will
apply our tools to actually prove this and come back with a conclusive answer.

Best,
Michael


From: Michael Tautschnig <mt(at)debian(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Jade Alglave <jade(dot)alglave(at)cs(dot)ox(dot)ac(dot)uk>, Vincent Nimal <vincent(dot)nimal(at)balliol(dot)ox(dot)ac(dot)uk>, Daniel Kroening <kroening(at)cs(dot)ox(dot)ac(dot)uk>
Subject: Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Date: 2012-03-24 17:01:32
Message-ID: 20120324170131.GB8779@l04.local
Lists: pgsql-hackers

Hi again,

[...]
>
> However, your example is enough unlike the actual code that the
> conclusion you state following the word "clearly" isn't actually clear
> to me. According to latch.h, the correct method of using a latch is
> like this:
>
> * for (;;)
> * {
> * ResetLatch();
> * if (work to do)
> * Do Stuff();
> * WaitLatch();
> * }
>
> Meanwhile, anyone who is creating additional work to do should add the
> work to the queue and then set the latch.

When writing the above statement, including the "clearly", we were probably
thinking too much of the above usage hint, which uses only ResetLatch and
WaitLatch, and were not considering that SetLatch is meant to be part of
Do Stuff().

So here they are once again: our version, and a more faithful translation into
PostgreSQL-function terms, this time including SetLatch (in line 16).

Our version:                            In PostgreSQL-function terms:

 1  #define WORKERS 2                    1  #define WORKERS 2
 2  volatile _Bool latch[WORKERS];       2  volatile _Bool latch[WORKERS];
 3  volatile _Bool flag[WORKERS];        3  volatile _Bool flag[WORKERS];
 4                                       4
 5  void worker(int i)                   5  void worker(int i)
 6  {                                    6  {
 7    while(!latch[i]);                  7    WaitLatch(i);
 8    for(;;)                            8    for(;;)
 9    {                                  9    {
10      assert(!latch[i] || flag[i]);   10      assert(!latch[i] || flag[i]);
11      latch[i] = 0;                   11      ResetLatch(i);
12      if(flag[i])                     12      if(flag[i])
13      {                               13      {
14        flag[i] = 0;                  14        flag[i] = 0;
15        flag[(i+1) % WORKERS] = 1;    15        flag[(i+1) % WORKERS] = 1;
16        latch[(i+1) % WORKERS] = 1;   16        SetLatch((i+1) % WORKERS);
17      }                               17      }
18                                      18
19      while(!latch[i]);               19      WaitLatch(i);
20    }                                 20    }
21  }                                   21  }

>
> So it seems to me that we could potentially fix this by inserting
> barriers at the end of ResetLatch and at the beginning of SetLatch and
> WaitLatch. Then the latch has to get reset before we check whether
> there's work to do; and we've got to finish checking for work before
> we again try to wait for the latch. Similarly, any work that was in
> progress before SetLatch was called will be forced to be committed to
> memory before SetLatch does anything else. Adding that many barriers
> might not be very good for performance but it seems OK from a
> correctness point of view, unless I am missing something, which is
> definitely possible. I'd appreciate any thoughts you have on this, as
> this is clearly subtle and tricky to get exactly right.
>

So we had suggested the following bugfixes:

>
> > In [3] it was suggested to fix the problem by placing a barrier in ResetLatch,
> > which corresponds to placing it between lines 11 and 12 in the code above. This
> > amounts to placing a barrier between the two reads (lines 7/19 and 12, i.e.,
> > between WaitLatch and the if(flag[1]) ) of Worker 1.
> >
> > Placing a sync (i.e., the strongest Power barrier) accordingly would, however,
> > still be insufficient for the second problem, as it would only fix the
> > reordering of read-read pairs by Worker 1 and the store atomicity issue from
> > Worker 0. But the writes on Worker 0 could still be reordered (problem number
> > 2). One possible fix consists of placing a sync between the two writes on Worker
> > 0, and an address dependency between the two reads on Worker 1. Clearly,
> > however, these are changes that cannot any longer be hidden behind the
> > ResetLatch/WaitLatch interface, but rather go in the code using these.
>
>

Here, "the two writes on Worker 0" corresponds to lines 15 and 16. And indeed
line 16 is exactly the call to SetLatch. For solving problem 1, the mp idiom,
the following options are possible (in all cases stronger synchronisation
primitives may be used, i.e., the strongest Power barrier, sync, may be used, or
lwsync may be used instead of an address dependency):

1. An lwsync at the beginning of SetLatch, and lwsync in ResetLatch (preferably
after the write).
2. An lwsync at the beginning of SetLatch, and an address dependency in
ResetLatch.

To address the second problem, the lb idiom, an address dependency has to be put
either in WaitLatch or SetLatch.

To fix both problems, the performance-wise cheapest option would thus be placing
an address dependency in ResetLatch and an lwsync in SetLatch. For practical
reasons, however, placing an lwsync in both places (at the beginning of SetLatch
and after the write in ResetLatch) might be preferable: address dependencies may
be optimised away by the C compiler, or require inline assembly in a form that
is not as easy to factor out as lwsync, and the interface of ResetLatch would
have to be amended.
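
To make the recommendation concrete, the placement would look roughly as
follows on the simplified busy-wait primitives (just a sketch: the inline
assembly is only meaningful when compiling for Power, and the real PostgreSQL
functions are more involved):

#define WORKERS 2
volatile _Bool latch[WORKERS];

#define LWSYNC() __asm__ __volatile__("lwsync" ::: "memory")

void ResetLatch(int i)
{
    latch[i] = 0;
    LWSYNC();    /* after the write: orders the reset before the read of flag */
}

void SetLatch(int i)
{
    LWSYNC();    /* at the beginning: orders the preceding flag accesses
                    before the write to the latch */
    latch[i] = 1;
}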

In summary, we were thus able to show that both points marked with "XXX there
really ought to be a memory barrier" in

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4e15a4db5e65e43271f8d20750d6500ab12632d0

are the appropriate points to place memory synchronisation primitives, and
picking an lwsync-equivalent in both cases is sound and does not require any
other modifications.

Best,
Michael


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Tautschnig <mt(at)debian(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org, Jade Alglave <jade(dot)alglave(at)cs(dot)ox(dot)ac(dot)uk>, Vincent Nimal <vincent(dot)nimal(at)balliol(dot)ox(dot)ac(dot)uk>, Daniel Kroening <kroening(at)cs(dot)ox(dot)ac(dot)uk>
Subject: Re: Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Date: 2012-03-26 14:40:59
Message-ID: CA+TgmobGdJ_2hC5rQ06MVH7ZtTrp-QEFBt4a+3BEaDKa4+6XbQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 24, 2012 at 1:01 PM, Michael Tautschnig <mt(at)debian(dot)org> wrote:
> Here, "the two writes on Worker 0" corresponds to lines 15 and 16. And indeed
> line 16 is exactly the call to SetLatch. For solving problem 1, the mp idiom,
> the following options are possible (in all cases stronger synchronisation
> primitives may be used, i.e., the strongest Power barrier, sync, may be used, or
> lwsync may be used instead of an address dependency):
>
> 1. An lwsync at the beginning of SetLatch, and lwsync in ResetLatch (preferably
> after the write).
> 2. An lwsync at the beginning of SetLatch, and an address dependency in
> ResetLatch.
>
> To address the second problem, the lb idiom, an address dependency has to be put
> either in WaitLatch or SetLatch.
>
> To fix both problems, the performance-wise cheapest option would thus be placing
> an address dependency in ResetLatch and an lwsync in SetLatch. For practical
> reasons, however, placing an lwsync in both places (at the beginning of SetLatch
> and after the write in ResetLatch) might be preferable, as address dependencies
> may be optimised away by the C compiler or require inline assembly in a form not
> as easy to factor out as lwsync, plus the interface of ResetLatch would have to
> be amended.
>
> In summary, we were thus able to show that both points marked with "XXX there
> really ought to be a memory barrier" in
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4e15a4db5e65e43271f8d20750d6500ab12632d0
>
> are the appropriate points to place memory synchronisation primitives, and
> picking an lwsync-equivalent in both cases is sound and does not require any
> other modifications.

It's interesting that you've concluded that memory barriers are needed
in exactly the two places that Tom concluded they were needed. I
think your analysis of what type of memory barrier is required is
faulty, however. In your version of the code, setting the latch is
represented by a single store. However, SetLatch() is more
complicated than that - it does a load, and depending on the results
of the load, it does a store and then maybe a system call. I suspect
that if you change line 16 to read:

if (!latch[(i+1) % WORKERS]) latch[(i+1) % WORKERS] = 1;

...then your tool will tell you that you need a full barrier before
that rather than just a store/store barrier.
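
For reference, the shape I'm describing is roughly this (field names simplified
and from memory; a paraphrase of the real function, not the actual source):

typedef struct
{
    volatile int is_set;
    /* ... owner pid etc. ... */
} Latch;

void SetLatch(volatile Latch *latch)
{
    /* a full barrier would go here, so that the load below cannot be
       reordered with the caller's earlier stores to the work queue */
    if (latch->is_set)          /* load */
        return;
    latch->is_set = 1;          /* store */
    /* ... then possibly a system call (signal or self-pipe write) to
       wake the process waiting on the latch ... */
}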

After staring at this for a while, I am fairly certain that
ResetLatch() needs a full barrier, too. It seems to me that the
formal hazard you're guarding against by inserting an lwsync here is
the load/load dependency between loading the latch and loading the
flag. In your version, that's probably OK, but in real life, it's
not, because WaitLatch() can wake up for a variety of reasons, not
just because the latch has been set. So the following is possible:
worker #1 executes line 14, worker #2 wakes up after line 19 and
executes line 10 and then performs the load on line 12 before the
store on line 11 (since the lwsync you propose doesn't act as a
store/load barrier), worker #1 executes lines 15 and 16, worker #2 now
does the store on line 11. At this point we are hosed, because worker
#2 has clobbered worker #1's attempt to set the latch without seeing
that the flag is set, and everybody goes into the tank and waits
forever (or until worker #2 receives another wake-up from some other
source, at which point we'll be back in business).

I proposed before that a barrier was also needed at the start of
WaitLatch(), to guard against a load/load dependency between loading
the flag (line 12) and loading the latch (line 19). Our version of
WaitLatch doesn't busy-wait, so the concern is approximately that we
could do the load at line 19, conclude that no waiting is needed, then
do the load at line 12, do the store at line 14, and then go to sleep.
That actually can't happen with this exact code, because if we were
to execute line 14 then we'd also hit the proposed lwsync at line 16
and so the loads would happen in order. But lines 15-16 need not be
there: the most common use of this machinery is to hand off requests
from a foreground process to a background process, and the background
process need not be friendly enough to supply a barrier after checking
the flag and before calling WaitLatch(). I think a load/load barrier
here would be enough, but we've actually got a full barrier, because
WaitLatch calls drainSelfPipe() before checking the flag, and as noted
in the comments we assume that a system call acts as a full barrier.
So there's no live bug here, but I think it's worth noting that you
can't conclude that WaitLatch() doesn't need a barrier on the basis of
this simplified example.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company