4. OpenMP Affinity

OpenMP affinity consists of a proc_bind policy (thread affinity policy) and a specification of places (location units, or processors, which may be cores, hardware threads, sockets, etc.). OpenMP affinity enables users to bind computations to specific places. The placement holds for the duration of the parallel region. However, if two or more cores (hardware threads, sockets, etc.) have been assigned to a given place, the runtime is free to migrate OpenMP threads among the cores (hardware threads, sockets, etc.) within that place.

Often the binding can be managed without explicitly setting places. Without a specification of places in the OMP_PLACES variable, the OpenMP runtime distributes and binds threads over the entire range of processors available to the OpenMP program, according to the OMP_PROC_BIND environment variable or the proc_bind clause. When places are specified, the OpenMP runtime binds threads to the places according to a default distribution policy, or to the policy specified in the OMP_PROC_BIND environment variable or the proc_bind clause.
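As a minimal sketch of binding without explicit places (the environment variables are from the OpenMP specification; the program name is hypothetical):

```shell
# No OMP_PLACES set: the runtime binds threads over all processors
# available to the program, following only OMP_PROC_BIND.
export OMP_NUM_THREADS=4
export OMP_PROC_BIND=close   # or spread, primary, true, false
./app                        # hypothetical OpenMP program
```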

In the OpenMP specification, a processor refers to an execution unit that an OpenMP thread is enabled to use. A processor is a core when there is no SMT (Simultaneous Multi-Threading) support or SMT is disabled. When SMT is enabled, a processor is a hardware thread (HW-thread). (This is the usual case, although the execution unit is ultimately implementation defined.) Processors are numbered sequentially from 0 to the number of cores minus one (without SMT), or from 0 to the number of HW-threads minus one (with SMT). OpenMP places use these processor numbers to designate binding locations (unless an abstract name is used).
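On Linux, the processor numbering described above can be inspected from the shell (nproc is from GNU coreutils; the 8-processor node in the comments is only an assumed example):

```shell
# nproc reports the number of processors available to this process;
# OpenMP processor numbers then run from 0 to that count minus one.
n=$(nproc)
echo "processor numbers: 0..$((n - 1))"

# Places can list those processor numbers explicitly, e.g. four places
# of two processors each on a hypothetical 8-processor node:
#   OMP_PLACES="{0,1},{2,3},{4,5},{6,7}"
# or equivalently, with <place>:<count>:<stride> interval notation:
#   OMP_PLACES="{0,1}:4:2"
```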

The processors available to a process may be a subset of the system’s processors. This restriction may be the result of a wrapper process controlling the execution (such as numactl on Linux systems), compiler options, library-specific environment variables, or default kernel settings. For instance, when multiple MPI processes are launched on a single compute node, each process will have a subset of the processors, as determined by the MPI launcher or set by the MPI library’s affinity environment variables.
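The effect of such a restriction can be demonstrated with taskset (util-linux) on Linux, which limits a command's CPU affinity mask; nproc then reports only the processors left in that mask:

```shell
# All processors visible to the current shell:
nproc
# Restrict the affinity mask to processor 0 only;
# nproc run under that mask reports a single processor.
taskset -c 0 nproc
```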

Threads of a team are positioned onto places compactly, in a scattered distribution, or onto the primary thread’s place by setting the OMP_PROC_BIND environment variable or the proc_bind clause to close, spread, or primary (master has been deprecated), respectively. When OMP_PROC_BIND is set to false, no binding is enforced; when the value is true, the binding is implementation defined, either to the set of places in the OMP_PLACES variable or, if OMP_PLACES is not set, to places defined by the implementation.
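For example, on an assumed two-socket node with one place per core (the program name is hypothetical), the three policies place the four threads differently:

```shell
export OMP_PLACES=cores
export OMP_NUM_THREADS=4

export OMP_PROC_BIND=close    # consecutive cores, filling the first socket first
./app
export OMP_PROC_BIND=spread   # cores spread evenly across both sockets
./app
export OMP_PROC_BIND=primary  # all threads on the primary thread's place
./app
```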

The OMP_PLACES variable can also be set to an abstract name (threads, cores, sockets) to specify that a place is a single hardware thread, a core, or a socket, respectively. This form of OMP_PLACES is most useful when the number of threads equals the number of hardware threads, cores, or sockets. It can also be used with a close or spread distribution policy when that equality does not hold.
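A minimal sketch of the abstract-name settings (the values are defined by the OpenMP specification; choose one):

```shell
export OMP_PLACES=threads   # each place is one hardware thread
export OMP_PLACES=cores     # each place is one core (all its HW-threads)
export OMP_PLACES=sockets   # each place is one socket
# With fewer threads than places, pair with a distribution policy:
export OMP_PROC_BIND=spread
```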