6.13. Declare Target Directive
6.13.1. Declare Target Directive for a Procedure
The following example shows how the declare target directive indicates that the fib function called inside a target region can execute on the default target device.
A version of the function is also available on the host device. When the conditional expression in the if clause on the target construct evaluates to false, the target region (and thus fib) executes on the host device.
In the following C/C++ code, the declaration of the function fib appears between the begin declare target and end declare target directives. In the corresponding Fortran code, the declare target directive appears at the end of the specification part of the subroutine.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.1
* type: C
* version: omp_5.1
*/
#pragma omp begin declare target
extern void fib(int N);
#pragma omp end declare target
#define THRESHOLD 1000000
void fib_wrapper(int n)
{
#pragma omp target if(n > THRESHOLD)
{
fib(n);
}
}
The Fortran fib subroutine contains a declare target declaration that tells the compiler to create a device-executable version of the procedure. The subroutine name is not included on the declare target directive and is therefore implicitly assumed.
The program uses the module_fib module, which presents an explicit interface to the compiler with the declare target declarations for processing the fib call.
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: declare_target.1
! type: F-free
! version: omp_4.0
module module_fib
contains
subroutine fib(N)
integer :: N
!$omp declare target
!...
end subroutine
end module
module params
integer :: THRESHOLD=1000000
end module
program my_fib
use params
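use module_fib
integer :: N = 8   ! N was otherwise undeclared and uninitialized; value borrowed from declare_target.2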
!$omp target if( N > THRESHOLD )
call fib(N)
!$omp end target
end program
The next Fortran example shows the use of an external subroutine. As the subroutine is neither use associated nor an internal procedure, the declare target declarations within the external subroutine are unknown to the main program unit; therefore, a declare target directive must be provided within the program scope (here, in an interface block) so that the compiler can determine that a target binary should be available.
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: declare_target.2
! type: F-free
! version: omp_4.0
program my_fib
integer :: N = 8
interface
subroutine fib(N)
!$omp declare target
integer :: N
end subroutine fib
end interface
!$omp target
call fib(N)
!$omp end target
end program
subroutine fib(N)
integer :: N
!$omp declare target
print*,"hello from fib"
!...
end subroutine
6.13.2. Declare Target Directive for Class Type
The following example shows the use of the begin declare target and end declare target pair to designate the beginning and end of the affected declarations, as introduced in OpenMP 5.1. The begin declare target directive was defined to symmetrically complement the terminating ("end") directive.
The example also shows three different ways to use a declare target directive for a class and an external member-function definition (for the XOR1, XOR2, and XOR3 classes and the definitions of their corresponding foo member functions).
For XOR1, a begin declare target and end declare target directive pair encloses both the class and its member function definition. The compiler immediately knows to create a device version of the function for execution in a target region.
For XOR2, the class member function definition is not specified with a declare target directive. An implicit declare target is created for the member function definition. The same applies if this declaration arrangement for the class and function is included through a header file.
For XOR3, the class and its member function are not enclosed by begin declare target and end declare target directives, but there is an implicit declare target because the class, its function, and the target construct are in the same file scope. That is, the class and its function are treated as if delimited by a declare target directive. The same applies if the class and function are included through a header file.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.2a
* type: C++
* version: omp_5.1
*/
#include <iostream>
using namespace std;
#pragma omp begin declare target // declare target--class and function
class XOR1
{
int a;
public:
XOR1(int arg): a(arg) {};
int foo();
};
int XOR1::foo() { return a^0x01;}
#pragma omp end declare target
#pragma omp begin declare target // declare target--class, not function
class XOR2
{
int a;
public:
XOR2(int arg): a(arg) {};
int foo();
};
#pragma omp end declare target
int XOR2::foo() { return a^0x01;}
class XOR3 // declare target--neither class nor function
{
int a;
public:
XOR3(int arg): a(arg) {};
int foo();
};
int XOR3::foo() { return a^0x01;}
int main (){
XOR1 my_XOR1(3);
XOR2 my_XOR2(3);
XOR3 my_XOR3(3);
int res1, res2, res3;
#pragma omp target map(tofrom:res1)
res1=my_XOR1.foo();
#pragma omp target map(tofrom:res2)
res2=my_XOR2.foo();
#pragma omp target map(tofrom:res3)
res3=my_XOR3.foo();
cout << res1 << endl; // OUT1: 2
cout << res2 << endl; // OUT2: 2
cout << res3 << endl; // OUT3: 2
}
Class definitions and their member function definitions are often placed in separate files, as shown in declare_target.2b_classes.hpp and declare_target.2b_functions.cpp below. In this case, the classes must be specified in a declare target directive. However, as long as the 2b_functions.cpp file includes the corresponding declare target classes, there is no need to specify the functions with a declare target directive; the functions are treated as if they were specified with one. Compiling the declare_target.2b_functions.cpp and declare_target.2b_main.cpp files separately and linking them creates the appropriate executable device functions for the target device.
//%compiler: clang
//%cflags: -fopenmp
#pragma omp begin declare target
class XOR1
{
int a;
public:
XOR1(int arg): a(arg) {};
int foo();
};
#pragma omp end declare target
//%compiler: clang
//%cflags: -fopenmp
/*
* @@name: declare_target.2b_functions
* @@type: C++
* @@compilable: yes
* @@linkable: no
* @@expect: failure
* @@version: omp_5.1
*/
#include "classes.hpp"
int XOR1::foo() { return a^0x01;}
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.2b_main
* type: C++
* version: omp_5.1
*/
#include <iostream>
using namespace std;
#include "classes.hpp"
int main (){
XOR1 my_XOR1(3);
int res1;
#pragma omp target map(from: res1)
res1=my_XOR1.foo();
cout << res1 << endl; // OUT1: 2
}
The following example shows how the begin declare target and end declare target directives are used to enclose the declaration of a variable varY with a class type typeY.
This example shows pre-OpenMP 5.0 behavior for the varY.foo() function call (an error). Before OpenMP 5.0, the member function typeY::foo() cannot be accessed on a target device because its declaration does not appear between begin declare target and end declare target directives. As of OpenMP 5.0, the function is implicitly treated as if declared with a declare target directive, and the call executes successfully on the device. See the previous examples.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.2c
* type: C++
* version: omp_5.2
*/
struct typeX
{
int a;
};
class typeY
{
int a;
public:
int foo() { return a^0x01;}
};
#pragma omp begin declare target
struct typeX varX; // ok
class typeY varY; // ok if varY.foo() not called on target device
#pragma omp end declare target
void foo()
{
#pragma omp target
{
varX.a = 100; // ok
varY.foo(); // error foo() is not available on a target device
}
}
6.13.3. Declare Target Directive for Variables
The following examples show how the declare target directive is used to indicate that global variables are mapped to the implicit device data environment of each target device.
In the following example, the declarations of the variables p, v1, and v2 appear between begin declare target and end declare target directives, indicating that the variables are mapped to the implicit device data environment of each target device. The target update directive is then used to manage the consistency of the variables p, v1, and v2 between the data environment of the encountering host device task and the implicit device data environment of the default target device.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.3
* type: C
* version: omp_5.1
*/
#define N 1000
#pragma omp begin declare target
float p[N], v1[N], v2[N];
#pragma omp end declare target
extern void init(float *, float *, int);
extern void output(float *, int);
void vec_mult()
{
int i;
init(v1, v2, N);
#pragma omp target update to(v1, v2)
#pragma omp target
#pragma omp parallel for
for (i=0; i<N; i++)
p[i] = v1[i] * v2[i];
#pragma omp target update from(p)
output(p, N);
}
The Fortran version of the above C code uses a different syntax. Fortran modules use a list syntax on the declare target directive to declare mapped variables.
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: declare_target.3
! type: F-free
! version: omp_4.0
module my_arrays
!$omp declare target (N, p, v1, v2)
integer, parameter :: N=1000
real :: p(N), v1(N), v2(N)
end module
subroutine vec_mult()
use my_arrays
integer :: i
call init(v1, v2, N);
!$omp target update to(v1, v2)
!$omp target
!$omp parallel do
do i = 1,N
p(i) = v1(i) * v2(i)
end do
!$omp end target
!$omp target update from (p)
call output(p, N)
end subroutine
The following example also indicates that the function Pfun() is available on the target device, as well as the variable Q, which is mapped to the implicit device data environment of each target device. The target update directive is then used to manage the consistency of the variable Q between the data environment of the encountering host device task and the implicit device data environment of the default target device.
In the following example, the function and variable declarations appear between the begin declare target and end declare target directives.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.4
* type: C
* version: omp_5.1
*/
#define N 10000
#pragma omp begin declare target
float Q[N][N];
float Pfun(const int i, const int k) { return Q[i][k] * Q[k][i]; }
#pragma omp end declare target
float accum(int k)
{
float tmp = 0.0;
#pragma omp target update to(Q)
#pragma omp target map(tofrom: tmp)
#pragma omp parallel for reduction(+:tmp)
for(int i=0; i < N; i++)
tmp += Pfun(i,k);
return tmp;
}
/* Note: The variable tmp is now mapped with tofrom, for correct
execution with 4.5 (and pre-4.5) compliant compilers.
See Devices Intro.
*/
The Fortran version of the above C code uses a different syntax. In Fortran modules, a list syntax on the declare target directive is used to declare mapped variables and procedures. The N and Q variables are declared as a comma-separated list. When the declare target directive is used to declare just the procedure, the procedure name need not be listed; it is implicitly assumed, as illustrated for the Pfun() function.
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: declare_target.4
! type: F-free
! version: omp_4.0
module my_global_array
!$omp declare target (N,Q)
integer, parameter :: N=10
real :: Q(N,N)
contains
function Pfun(i,k)
!$omp declare target
real :: Pfun
integer,intent(in) :: i,k
Pfun=(Q(i,k) * Q(k,i))
end function
end module
function accum(k) result(tmp)
use my_global_array
real :: tmp
integer :: i, k
tmp = 0.0e0
!$omp target map(tofrom: tmp)
!$omp parallel do reduction(+:tmp)
do i=1,N
tmp = tmp + Pfun(k,i)
end do
!$omp end target
end function
! Note: The variable tmp is now mapped with tofrom, for correct
! execution with 4.5 (and pre-4.5) compliant compilers. See Devices Intro.
6.13.4. Declare Target Directive with declare simd
The following example shows how the begin declare target and end declare target directives are used to indicate that a function is available on a target device. The declare simd directive indicates that there is a SIMD version of the function P() that is available on the target device as well as one that is available on the host device.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.5
* type: C
* version: omp_5.1
*/
#define N 10000
#define M 1024
#pragma omp begin declare target
float Q[N][N];
#pragma omp declare simd uniform(i) linear(k) notinbranch
float P(const int i, const int k)
{
return Q[i][k] * Q[k][i];
}
#pragma omp end declare target
float accum(void)
{
float tmp = 0.0;
int i, k;
#pragma omp target map(tofrom: tmp)
#pragma omp parallel for reduction(+:tmp)
for (i=0; i < N; i++) {
float tmp1 = 0.0;
#pragma omp simd reduction(+:tmp1)
for (k=0; k < M; k++) {
tmp1 += P(i,k);
}
tmp += tmp1;
}
return tmp;
}
/* Note: The variable tmp is now mapped with tofrom, for correct
execution with 4.5 (and pre-4.5) compliant compilers.
See Devices Intro.
*/
The Fortran version of the above C code uses a different syntax. Fortran modules use a list syntax on the declare target directive for the mapping. Here the N and Q variables are declared in the list form as a comma-separated list. The function declaration does not use a list and implicitly assumes the function name. In this Fortran example, row and column indices are reversed relative to the C/C++ example, as is usual for codes optimized for memory access.
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: declare_target.5
! type: F-free
! version: omp_4.0
module my_global_array
!$omp declare target (N,Q)
integer, parameter :: N=10000, M=1024
real :: Q(N,N)
contains
function P(k,i)
!$omp declare simd uniform(i) linear(k) notinbranch
!$omp declare target
real :: P
integer,intent(in) :: k,i
P=(Q(k,i) * Q(i,k))
end function
end module
function accum() result(tmp)
use my_global_array
real :: tmp, tmp1
integer :: i
tmp = 0.0e0
!$omp target map(tofrom: tmp)
!$omp parallel do private(tmp1) reduction(+:tmp)
do i=1,N
tmp1 = 0.0e0
!$omp simd reduction(+:tmp1)
do k = 1,M
tmp1 = tmp1 + P(k,i)
end do
tmp = tmp + tmp1
end do
!$omp end target
end function
! Note: The variable tmp is now mapped with tofrom, for correct
! execution with 4.5 (and pre-4.5) compliant compilers. See Devices Intro.
6.13.5. Declare Target Directive with link Clause
In the OpenMP 4.5 standard the declare target directive was extended to allow static data to be mapped, when needed, through a link clause.
Data storage for items listed in the link clause becomes available on the device when it is mapped implicitly or explicitly in a map clause, and it persists for the scope of the mapping (as specified by a target construct, a target data construct, or target enter/exit data constructs).
Tip: When the global data items do not all fit on a device at once and are not needed simultaneously, use the link clause and map the data only when it is needed.
The following C and Fortran examples show two sets of data (single precision and double precision) that are global on the host for the entire execution but are used globally on the device for only part of the program execution. The single precision data are allocated and persist only for the first target region. Similarly, the double precision data are in scope on the device only for the second target region.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: declare_target.6
* type: C
* version: omp_5.1
*/
#define N 100000000
float sp[N], sv1[N], sv2[N];
double dp[N], dv1[N], dv2[N];
#pragma omp declare target link(sp,sv1,sv2) \
link(dp,dv1,dv2)
void s_init(float *, float *, int);
void d_init(double *, double *, int);
void s_output(float *, int);
void d_output(double *, int);
#pragma omp begin declare target
void s_vec_mult_accum()
{
int i;
#pragma omp parallel for
for (i=0; i<N; i++)
sp[i] = sv1[i] * sv2[i];
}
void d_vec_mult_accum()
{
int i;
#pragma omp parallel for
for (i=0; i<N; i++)
dp[i] = dv1[i] * dv2[i];
}
#pragma omp end declare target
int main()
{
s_init(sv1, sv2, N);
#pragma omp target map(to:sv1,sv2) map(from:sp)
s_vec_mult_accum();
s_output(sp, N);
d_init(dv1, dv2, N);
#pragma omp target map(to:dv1,dv2) map(from:dp)
d_vec_mult_accum();
d_output(dp, N);
return 0;
}
!!%compiler: gfortran
!!%cflags: -fopenmp
! name: declare_target.6
! type: F-free
! version: omp_4.5
module m_dat
integer, parameter :: N=100000000
!$omp declare target link(sp,sv1,sv2)
real :: sp(N), sv1(N), sv2(N)
!$omp declare target link(dp,dv1,dv2)
double precision :: dp(N), dv1(N), dv2(N)
contains
subroutine s_vec_mult_accum()
!$omp declare target
integer :: i
!$omp parallel do
do i = 1,N
sp(i) = sv1(i) * sv2(i)
end do
end subroutine s_vec_mult_accum
subroutine d_vec_mult_accum()
!$omp declare target
integer :: i
!$omp parallel do
do i = 1,N
dp(i) = dv1(i) * dv2(i)
end do
end subroutine
end module m_dat
program prec_vec_mult
use m_dat
call s_init(sv1, sv2, N)
!$omp target map(to:sv1,sv2) map(from:sp)
call s_vec_mult_accum()
!$omp end target
call s_output(sp, N)
call d_init(dv1, dv2, N)
!$omp target map(to:dv1,dv2) map(from:dp)
call d_vec_mult_accum()
!$omp end target
call d_output(dp, N)
end program