4.2. OpenMP Device Constructs

OpenMP provides a set of device constructs that let programmers offload computations to target devices such as GPUs. These constructs define which code regions execute on the device, control how data moves between host and device memory, and determine when the host waits for device work to finish. This section covers the key device constructs: target, target data, target update, target enter data, and target exit data.

4.2.1. target directive

The target directive is the primary construct for offloading a region of code to a device. The structured block that follows it executes on the device, and by default the encountering host thread waits until the region completes. The syntax of the target directive is as follows:

#pragma omp target [clause[[,] clause]...]
{
  // Code to be executed on the device
}

The target directive supports various clauses to control data mapping, device selection, and synchronization. Some commonly used clauses include:

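  • map: Specifies which variables are mapped to the device and in which direction their data is copied (for example to, from, tofrom, or alloc).

  • device: Selects the device, by number, on which the code should be executed. If the clause is omitted, the default device is used.

  • nowait: Lets the host thread continue without waiting for the target region to complete. The region then runs as a deferred task that can be waited on later, for example with taskwait.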
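The device and nowait clauses can be combined with map as in the minimal sketch below. The function name scale_async, its parameters, and the choice of host work are hypothetical and serve only as illustration. The sketch assumes the caller passes a valid device number in dev, and that x and y do not overlap.

```c
/* Hypothetical helper: scales x[0..n-1] by `factor` on device `dev`
 * while the host fills y[0..n-1] with unrelated values. */
void scale_async(double *x, double *y, int n, double factor, int dev)
{
  /* nowait: the host thread may continue past this target region. */
  #pragma omp target device(dev) map(tofrom: x[0:n]) nowait
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      x[i] *= factor;
    }
  }

  /* Independent host work that does not touch x. */
  for (int i = 0; i < n; i++) {
    y[i] = (double)i;
  }

  /* Wait for the deferred target task before x is used on the host. */
  #pragma omp taskwait
}
```

Because the target region may still be running when the host loop executes, the host must not read or write x until the taskwait has completed.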

Example:

#pragma omp target map(to: a[0:n], b[0:n]) map(from: c[0:n])
{
  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    c[i] = a[i] + b[i];
  }
}

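In this example, the contents of arrays a and b are copied to the device when the target region begins. Array c is allocated on the device without being initialized from the host, and its contents are copied back to the host when the region ends.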

4.2.2. target data directive

The target data directive maps data to the device for the duration of a structured block, without itself offloading any code. Target regions inside the block reuse the mapped data, so it is transferred once rather than at every target region. The syntax of the target data directive is as follows:

#pragma omp target data [clause[[,] clause]...]
{
  // Code region with mapped data
}

The target data directive supports clauses similar to the target directive, such as map and device, to control data mapping and device selection.

Example:

#pragma omp target data map(to: a[0:n], b[0:n]) map(from: c[0:n])
{
  #pragma omp target
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      c[i] = a[i] + b[i];
    }
  }
  
  #pragma omp target
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      c[i] *= 2;
    }
  }
}

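In this example, the target data directive copies arrays a and b to the device once, on entry to the region, and allocates c there. Both target regions operate on the same device copies, so no data is transferred between them. Array c is copied back to the host once, when the target data region ends.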

4.2.3. target update directive

The target update directive makes the host and device copies of already-mapped data consistent. It is a standalone directive executed on the host, outside of any target region, and it copies only the variables or array sections that it lists. The syntax of the target update directive is as follows:

#pragma omp target update [clause[[,] clause]...]

The direction of the update is given by the to clause (host to device) or the from clause (device to host).

Example:

#pragma omp target data map(to: a[0:n], b[0:n]) map(from: c[0:n])
{
  #pragma omp target
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      c[i] = a[i] + b[i];
    }
  }
  
  // Modify 'a' on the host
  for (int i = 0; i < n; i++) {
    a[i] *= 2;
  }
  
  #pragma omp target update to(a[0:n])
  
  #pragma omp target
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      c[i] += a[i];
    }
  }
}

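In this example, after array a is modified on the host, the target update directive copies the new values to the device before the second target region. Without it, the second target region would read the stale device copy of a, because the map(to: ...) transfer happens only once, on entry to the target data region.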
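The from clause works in the opposite direction and is useful for inspecting intermediate results on the host while the data stays mapped. The following is a minimal sketch; the function name double_and_trace and its parameters are hypothetical, and trace is assumed to have room for at least steps elements.

```c
/* Hypothetical helper: runs `steps` doubling passes over v[0..n-1] on the
 * device and records v[0] on the host after every pass in trace[]. */
void double_and_trace(double *v, int n, int steps, double *trace)
{
  #pragma omp target data map(tofrom: v[0:n])
  {
    for (int s = 0; s < steps; s++) {
      #pragma omp target
      {
        #pragma omp parallel for
        for (int i = 0; i < n; i++) {
          v[i] *= 2.0;
        }
      }

      /* Copy only the first element back; the rest stays on the device. */
      #pragma omp target update from(v[0:1])

      trace[s] = v[0];  /* host read of the refreshed value */
    }
  }
}
```

Only one element is transferred per pass. The full array is copied back once, when the target data region ends.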

4.2.4. target enter data and target exit data directives

The target enter data and target exit data directives are used to manage the lifetime of data on the device. Unlike target data, they are standalone directives that are not tied to a structured block, so a mapping created at one point in the program can be removed at a different point, even in another function.

The target enter data directive is used to allocate memory and transfer data to the device. It has the following syntax:

#pragma omp target enter data [clause[[,] clause]...]

The target exit data directive is used to deallocate memory and transfer data back from the device. It has the following syntax:

#pragma omp target exit data [clause[[,] clause]...]

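Both directives support clauses such as map and device to control data mapping and device selection. On target enter data the map type is typically to or alloc; on target exit data it is from, release, or delete.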

Example:

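#pragma omp target enter data map(to: a[0:n], b[0:n]) map(alloc: c[0:n])

#pragma omp target
{
  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    c[i] = a[i] + b[i];
  }
}

#pragma omp target exit data map(from: c[0:n]) map(release: a[0:n], b[0:n])

In this example, the target enter data directive allocates device memory for all three arrays and copies a and b to the device. Array c is only allocated (alloc), since its initial contents are not needed. The target region then finds all three arrays already present on the device. After the target region, the target exit data directive copies c back to the host and releases the mappings of a, b, and c, which frees their device memory.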
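Because these directives are not bound to a structured block, the mapping can be created and removed in different functions. The following is a minimal sketch of that pattern; the function names field_to_device, field_shift, and field_from_device are hypothetical, and the caller is assumed to pass the same pointer and length to all three.

```c
/* Hypothetical helpers showing a mapping that outlives a single function. */

/* Map buf[0..n-1] to the device; the mapping persists after return. */
void field_to_device(double *buf, int n)
{
  #pragma omp target enter data map(to: buf[0:n])
}

/* Add `delta` to every element of the device copy. */
void field_shift(double *buf, int n, double delta)
{
  /* buf is already present on the device, so this map clause
   * triggers no transfer; it only refers to the existing mapping. */
  #pragma omp target map(tofrom: buf[0:n])
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      buf[i] += delta;
    }
  }
}

/* Copy the result back to the host and remove the mapping. */
void field_from_device(double *buf, int n)
{
  #pragma omp target exit data map(from: buf[0:n])
}
```

A caller would invoke field_to_device once, call field_shift as often as needed, and finish with field_from_device. Until that last call, the host copy of buf should be treated as out of date.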

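Together, these constructs cover the main tasks of offloading with OpenMP: target executes code on the device, target data and the target enter data and target exit data pair control how long data stays there, and target update keeps the host and device copies consistent. By combining these constructs and their associated clauses, programmers can keep data transfers to a minimum while offloading computations to GPU devices.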

In the next section, we will explore the various ways to map data between the host and the device using the map clause and other data mapping techniques in OpenMP.