Synchronization Based on Acquire/Release Semantics
9.7. Synchronization Based on Acquire/Release Semantics#
As explained in the Memory Model chapter of this document, a flush operation may be an acquire flush and/or a release flush, and OpenMP 5.0 defines acquire/release semantics in terms of these fundamental flush operations. For any synchronization between two threads that is specified by OpenMP, a release flush logically occurs at the source of the synchronization and an acquire flush logically occurs at the sink of the synchronization. OpenMP 5.0 added memory ordering clauses - acquire, release, and acq_rel - to the flush and atomic constructs for explicitly requesting acquire/release semantics. Furthermore, implicit flushes for all OpenMP constructs and runtime routines that synchronize OpenMP threads in some manner were redefined in terms of synchronizing release and acquire flushes to avoid the requirement of strong memory fences (see the Flush Synchronization and Happens Before and Implicit Flushes sections of the OpenMP Specifications document).
The examples that follow in this section illustrate how acquire and release flushes may be employed, implicitly or explicitly, for synchronizing threads. A flush directive without a list and without any memory ordering clause can also function as both an acquire and release flush for facilitating thread synchronization. Flushes implied on entry to, or exit from, an atomic operation (specified by an atomic construct) may function as an acquire flush or a release flush if a memory ordering clause appears on the construct. On entry to and exit from a critical construct there is now an implicit acquire flush and release flush, respectively.
The first example illustrates how the release and acquire flushes implied by a critical region guarantee a value written by the first thread is visible to a read of the value on the second thread. Thread 0 writes to x and then executes a critical region in which it writes to y ; the write to x happens before the execution of the critical region, consistent with the program order of the thread. Meanwhile, thread 1 executes a critical region in a loop until it reads a non-zero value from y in the critical region, after which it prints the value of x ; again, the execution of the critical regions happen before the read from x based on the program order of the thread. The critical regions executed by the two threads execute in a serial manner, with a pairwise synchronization from the exit of one critical region to the entry to the next critical region. These pairwise synchronizations result from the implicit release flushes that occur on exit from critical regions and the implicit acquire flushes that occur on entry to critical regions; hence, the execution of each critical region in the sequence happens before the execution of the next critical region. A “happens before’’ order is therefore established between the assignment to x by thread 0 and the read from x by thread 1, and so thread 1 must see that x equals 10.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: acquire_release.1
* type: C
* version: omp_5.0
*/
#include <stdio.h>
#include <omp.h>
int main()
{
int x = 0, y = 0;
#pragma omp parallel num_threads(2)
{
int thrd = omp_get_thread_num();
if (thrd == 0) {
x = 10;
#pragma omp critical
{ y = 1; }
} else {
int tmp = 0;
while (tmp == 0) {
#pragma omp critical
{ tmp = y; }
}
printf("x = %d\n", x); // always "x = 10"
}
}
return 0;
}
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: acquire_release.1
! type: F-free
! version: omp_5.0
program rel_acq_ex1
use omp_lib
integer :: x, y, thrd, tmp
x = 0
y = 0
!$omp parallel num_threads(2) private(thrd, tmp)
thrd = omp_get_thread_num()
if (thrd == 0) then
x = 10
!$omp critical
y = 1
!$omp end critical
else
tmp = 0
do while (tmp == 0)
!$omp critical
tmp = y
!$omp end critical
end do
print *, "x = ", x !! always "x = 10"
end if
!$omp end parallel
end program
In the second example, the critical constructs are exchanged with atomic constructs that have explicit memory ordering specified. When the atomic read operation on thread 1 reads a non-zero value from y , this results in a release/acquire synchronization that in turn implies that the assignment to x on thread 0 happens before the read of x on thread 1. Therefore, thread 1 will print “x = 10’’.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: acquire_release.2
* type: C
* version: omp_5.0
*/
#include <stdio.h>
#include <omp.h>
int main()
{
int x = 0, y = 0;
#pragma omp parallel num_threads(2)
{
int thrd = omp_get_thread_num();
if (thrd == 0) {
x = 10;
#pragma omp atomic write release // or seq_cst
y = 1;
} else {
int tmp = 0;
while (tmp == 0) {
#pragma omp atomic read acquire // or seq_cst
tmp = y;
}
printf("x = %d\n", x); // always "x = 10"
}
}
return 0;
}
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: acquire_release.2
! type: F-free
! version: omp_5.0
program rel_acq_ex2
use omp_lib
integer :: x, y, thrd, tmp
x = 0
y = 0
!$omp parallel num_threads(2) private(thrd, tmp)
thrd = omp_get_thread_num()
if (thrd == 0) then
x = 10
!$omp atomic write release ! or seq_cst
y = 1
!$omp end atomic
else
tmp = 0
do while (tmp == 0)
!$omp atomic read acquire ! or seq_cst
tmp = y
!$omp end atomic
end do
print *, "x = ", x !! always "x = 10"
end if
!$omp end parallel
end program
In the third example, atomic constructs that specify relaxed atomic operations are used with explicit flush directives to enforce memory ordering between the two threads. The explicit flush directive on thread 0 must specify a release flush and the explicit flush directive on thread 1 must specify an acquire flush to establish a release/acquire synchronization between the two threads. The flush and atomic constructs encountered by thread 0 can be replaced by the atomic construct used in Example 2 for thread 0, and similarly the flush and atomic constructs encountered by thread 1 can be replaced by the atomic construct used in Example 2 for thread 1.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: acquire_release.3
* type: C
* version: omp_5.0
*/
#include <stdio.h>
#include <omp.h>
int main()
{
int x = 0, y = 0;
#pragma omp parallel num_threads(2)
{
int thrd = omp_get_thread_num();
if (thrd == 0) {
x = 10;
#pragma omp flush // or with acq_rel or release clause
#pragma omp atomic write // or with relaxed clause
y = 1;
} else {
int tmp = 0;
while (tmp == 0) {
#pragma omp atomic read // or with relaxed clause
tmp = y;
}
#pragma omp flush // or with acq_rel or acquire clause
printf("x = %d\n", x); // always "x = 10"
}
}
return 0;
}
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: acquire_release.3
! type: F-free
! version: omp_5.0
program rel_acq_ex3
use omp_lib
integer :: x, y, thrd, tmp
x = 0
y = 0
!$omp parallel num_threads(2) private(thrd, tmp)
thrd = omp_get_thread_num()
if (thrd == 0) then
x = 10
!$omp flush ! or with acq_rel or release clause
!$omp atomic write
y = 1
!$omp end atomic
else
tmp = 0
do while (tmp == 0)
!$omp atomic read
tmp = y
!$omp end atomic
end do
!$omp flush ! or with acq_rel or acquire clause
print *, "x = ", x !! always "x = 10"
end if
!$omp end parallel
end program
Example 4 will fail to order the write to x on thread 0 before the read from x on thread 1. Importantly, the implicit release flush on exit from the critical region will not synchronize with the acquire flush that occurs on the atomic read operation performed by thread 1. This is because implicit release flushes that occur on a given construct may only synchronize with implicit acquire flushes on a compatible construct (and vice-versa) that internally makes use of the same synchronization variable. For a critical construct, this might correspond to a lock object that is used by a given implementation (for the synchronization semantics of other constructs due to implicit release and acquire flushes, refer to the Implicit Flushes section of the OpenMP Specifications document). Either an explicit flush directive that provides a release flush (i.e., a flush without a list that does not have the acquire clause) must be specified between the critical construct and the atomic write, or an atomic operation that modifies y and provides release semantics must be specified.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: acquire_release_broke.4
* type: C
* version: omp_5.0
*/
#include <stdio.h>
#include <omp.h>
int main()
{
// !!! THIS CODE WILL FAIL TO PRODUCE CONSISTENT RESULTS !!!!!!!
// !!! DO NOT PROGRAM SYNCHRONIZATION THIS WAY !!!!!!!
int x = 0, y;
#pragma omp parallel num_threads(2)
{
int thrd = omp_get_thread_num();
if (thrd == 0) {
#pragma omp critical
{ x = 10; }
// an explicit flush directive that provides
// release semantics is needed here
// to complete the synchronization.
#pragma omp atomic write
y = 1;
} else {
int tmp = 0;
while (tmp == 0) {
#pragma omp atomic read acquire // or seq_cst
tmp = y;
}
#pragma omp critical
{ printf("x = %d\n", x); } // !! NOT ALWAYS 10
}
}
return 0;
}
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: acquire_release_broke.4
! type: F-free
! version: omp_5.0
program rel_acq_ex4
use omp_lib
integer :: x, y, thrd
integer :: tmp
x = 0
!! !!! THIS CODE WILL FAIL TO PRODUCE CONSISTENT RESULTS !!!!!!!
!! !!! DO NOT PROGRAM SYNCHRONIZATION THIS WAY !!!!!!!
!$omp parallel num_threads(2) private(thrd) private(tmp)
thrd = omp_get_thread_num()
if (thrd == 0) then
!$omp critical
x = 10
!$omp end critical
! an explicit flush directive that provides
! release semantics is needed here to
! complete the synchronization.
!$omp atomic write
y = 1
!$omp end atomic
else
tmp = 0
do while(tmp == 0)
!$omp atomic read acquire ! or seq_cst
tmp = x
!$omp end atomic
end do
!$omp critical
print *, "x = ", x !! !! NOT ALWAYS 10
!$omp end critical
end if
!$omp end parallel
end program