7.4. ref, val, uval Modifiers for linear Clause

When generating vector functions from declare simd directives, the compiler must know the proper types of the function arguments in order to generate efficient code. This is especially true for C++ reference types and Fortran arguments.

In the following example, the function add_one2 has a C++ reference parameter (or Fortran argument) p, which the function increments by 1. The caller loop i in the main program passes a variable k by reference to add_one2. The ref modifier for the linear clause on the declare simd directive specifies that the reference-type parameter p is to match the property of the variable k in the loop. This use of a reference is equivalent to the second call to add_one2, which passes the array element a[i] directly. In the example, the preferred vector length 8 is specified for both the caller loop and the callee function.

When linear(p: ref) is applied to an argument passed by reference, it tells the compiler that the addresses in its vector argument are consecutive, so the compiler can generate a single vector load or store instead of a gather or scatter. This allows more efficient SIMD code to be generated with fewer source changes.

//%compiler: clang
//%cflags: -fopenmp

/*
* name: linear_modifier.1
* type: C++
* version: omp_5.2
*/
#include <stdio.h>

#define NN 1023
int a[NN];

#pragma omp declare simd linear(p: ref) simdlen(8)
void add_one2(int& p)
{
   p += 1;
}

int main(void)
{
   int i;
   int* p = a;

   for (i = 0; i < NN; i++) {
      a[i] = i;
   }

#pragma omp simd linear(p) simdlen(8)
   for (i = 0; i < NN; i++) {
      int& k = *p;
      add_one2(k);
      add_one2(a[i]);
      p++;
   }

   for (i = 0; i < NN; i++) {
      if (a[i] != i+2) {
         printf("failed\n");
         return 1;
      }
   }
   printf("passed\n");
   return 0;
}

!!%compiler: gfortran
!!%cflags: -fopenmp

! name: linear_modifier.1
! type: F-free
! version: omp_5.2
module m
   integer, parameter :: NN = 1023
   integer :: a(NN)

 contains
   subroutine add_one2(p)
   !$omp declare simd(add_one2) linear(p: ref) simdlen(8)
   implicit none
   integer :: p

   p = p + 1
   end subroutine
end module

program main
   use m
   implicit none
   integer :: i, p

   do i = 1, NN
      a(i) = i
   end do

   p = 1
   !$omp simd linear(p) simdlen(8)
   do i = 1, NN
      associate(k => a(p))
         call add_one2(k)
      end associate
      call add_one2(a(i))
      p = p + 1
   end do

   do i = 1, NN
      if (a(i) /= i+2) then
         print *, "failed"
         stop
      endif
   end do
   print *, "passed"
end program
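
To make the gather-versus-contiguous distinction concrete, the following sketch emulates in plain C++ what an 8-lane vector variant of add_one2 might do with and without linear(p: ref). It is an illustration under stated assumptions, not the code any particular compiler emits: the names REF_VLEN, add_one2_per_lane_addr, add_one2_contiguous, and count_ref_mismatches are hypothetical, and the lane loops stand in for vector instructions.

```cpp
#include <array>
#include <cassert>

constexpr int REF_VLEN = 8;  // hypothetical lane count, mirroring simdlen(8)

// Emulated variant WITHOUT linear(p: ref): every lane carries its own
// address, so each access is an independent (gather/scatter-like) operation.
void add_one2_per_lane_addr(const std::array<int*, REF_VLEN>& addr)
{
    for (int lane = 0; lane < REF_VLEN; ++lane) {
        *addr[lane] += 1;
    }
}

// Emulated variant WITH linear(p: ref): lane l is known to refer to base[l],
// so a single base address describes one contiguous block of memory.
void add_one2_contiguous(int* base)
{
    assert(base != nullptr);
    for (int lane = 0; lane < REF_VLEN; ++lane) {
        base[lane] += 1;
    }
}

// Runs both emulated variants on identical data and counts disagreements.
int count_ref_mismatches()
{
    int a[REF_VLEN];
    int b[REF_VLEN];
    std::array<int*, REF_VLEN> addr{};
    for (int lane = 0; lane < REF_VLEN; ++lane) {
        a[lane] = b[lane] = lane;
        addr[lane] = &a[lane];
    }

    add_one2_per_lane_addr(addr);
    add_one2_contiguous(b);

    int mismatches = 0;
    for (int lane = 0; lane < REF_VLEN; ++lane) {
        if (a[lane] != b[lane] || b[lane] != lane + 1) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Both variants compute the same result; the difference is only in how much address information the callee receives. With a single base address, the loop in add_one2_contiguous maps naturally onto one vector load, one vector add, and one vector store.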

The following example is a variant of the previous one. The function add_one2 in the C++ code has an additional C++ reference parameter i. The loop index i of the caller loop in the main program is passed by reference to add_one2. Across SIMD lanes, the loop index i has a uniform address and a linear value with step 1. Thus, the uval modifier is used for the linear clause to specify that the C++ reference-type parameter i is to match the property of the loop index i.

In the corresponding Fortran code, the arguments p and i of the routine add_one2 are passed by reference. The same modifiers are used for these arguments in the linear clauses to match their properties in the caller loop in the main program.

When linear(i: uval) is applied to an argument passed by reference, it tells the compiler that the addresses in its vector argument are uniform, so the compiler can generate a scalar load or scalar store and create the linear values from it. This allows more efficient SIMD code to be generated with fewer source changes.

//%compiler: clang
//%cflags: -fopenmp

/*
* name: linear_modifier.2
* type: C++
* version: omp_5.2
*/
#include <stdio.h>

#define NN 1023
int a[NN];

#pragma omp declare simd linear(p: ref) linear(i: uval)
void add_one2(int& p, const int& i)
{
   p += i;
}

int main(void)
{
   int i;
   int* p = a;

   for (i = 0; i < NN; i++) {
      a[i] = i;
   }

   #pragma omp simd linear(p)
   for (i = 0; i < NN; i++) {
      int& k = *p;
      add_one2(k, i);
      p++;
   }

   for (i = 0; i < NN; i++) {
      if (a[i] != i*2) {
         printf("failed\n");
         return 1;
      }
   }
   printf("passed\n");
   return 0;
}

!!%compiler: gfortran
!!%cflags: -fopenmp

! name: linear_modifier.2
! type: F-free
! version: omp_5.2
module m
   integer, parameter :: NN = 1023
   integer :: a(NN)

 contains
   subroutine add_one2(p, i)
   !$omp declare simd(add_one2) linear(p: ref) linear(i: uval)
   implicit none
   integer :: p
   integer, intent(in) :: i

   p = p + i
   end subroutine
end module

program main
   use m
   implicit none
   integer :: i, p

   do i = 1, NN
      a(i) = i
   end do

   p = 1
   !$omp simd linear(p)
   do i = 1, NN
      call add_one2(a(p), i)
      p = p + 1
   end do

   do i = 1, NN
      if (a(i) /= i*2) then
         print *, "failed"
         stop
      endif
   end do
   print *, "passed"
end program
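
The effect of uval can be sketched in the same emulated style. In the hypothetical function add_one2_uval below, all lanes share one address for i, so a single scalar load is enough, and the per-lane values are rebuilt by adding the lane number. The names UVAL_VLEN, add_one2_uval, and count_uval_mismatches are assumptions made for illustration, and the lane loop again stands in for vector instructions.

```cpp
#include <cassert>

constexpr int UVAL_VLEN = 8;  // hypothetical lane count

// Emulated vector variant for linear(p: ref) linear(i: uval):
// p_base covers contiguous elements; i_addr is one address shared by all lanes.
void add_one2_uval(int* p_base, const int* i_addr)
{
    assert(p_base != nullptr && i_addr != nullptr);
    const int i0 = *i_addr;  // one scalar load serves every lane
    for (int lane = 0; lane < UVAL_VLEN; ++lane) {
        // Lane l sees the value i0 + l (linear with step 1).
        p_base[lane] += i0 + lane;
    }
}

// Processes 16 elements in two chunks of UVAL_VLEN and counts wrong results.
int count_uval_mismatches()
{
    constexpr int n = 2 * UVAL_VLEN;
    int a[n];
    for (int k = 0; k < n; ++k) {
        a[k] = k;
    }

    for (int i = 0; i < n; i += UVAL_VLEN) {
        // i holds the index belonging to the first lane of the chunk.
        add_one2_uval(&a[i], &i);
    }

    int mismatches = 0;
    for (int k = 0; k < n; ++k) {
        if (a[k] != 2 * k) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

As in the example above, each element ends up doubled, which is the same condition the original test checks with a[i] != i*2.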

In the following example, the function func takes arrays x and y as arguments and accesses the array elements selected by the index i. The caller loop i in the main program passes a linear copy of the variable k to func. The val modifier is used for the linear clause in the declare simd directive for func to specify that the argument i is to match the property of the actual argument k passed in the SIMD loop. Arrays x and y have uniform addresses across SIMD lanes.

When linear(i: val,step(1)) is applied to an argument, it tells the compiler that the addresses in its vector argument may not be consecutive, but that the values are linear (with stride 1 here). When the value of i is used as a subscript in array references (e.g., x[i]), the compiler can generate a vector load or store instead of a gather or scatter. This allows more efficient SIMD code to be generated with fewer source changes.

//%compiler: clang
//%cflags: -fopenmp

/*
* name: linear_modifier.3
* type: C
* version: omp_5.2
*/
#include <stdio.h>

#define N 128

#pragma omp declare simd simdlen(4) uniform(x, y) linear(i:val,step(1))
double func(double x[], double y[], int i)
{
   return (x[i] + y[i]);
}

int main(void)
{
   double x[N], y[N], z1[N], z2;
   int i, k;

   for (i = 0; i < N; i++) {
      x[i] = (double)i;
      y[i] = (double)i*2;
   }

   k = 0;
#pragma omp simd linear(k)
   for (i = 0; i < N; i++) {
      z1[i] = func(x, y, k);
      k++;
   }

   for (i = 0; i < N; i++) {
      z2 = (double)(i + i*2);
      if (z1[i] != z2) {
         printf("failed\n");
         return 1;
      }
   }
   printf("passed\n");
   return 0;
}

!!%compiler: gfortran
!!%cflags: -fopenmp

! name: linear_modifier.3
! type: F-free
! version: omp_5.2
module func_mod
contains
   real(8) function func(x, y, i)
!$omp declare simd(func) simdlen(4) uniform(x, y) linear(i:val,step(1))
      implicit none
      real(8), intent(in) :: x(*), y(*)
      integer, intent(in) :: i

      func = x(i) + y(i)

   end function func
end module func_mod

program main
   use func_mod
   implicit none
   integer, parameter :: n = 128
   real(8) :: x(n), y(n), z1(n), z2
   integer :: i, k

   do i=1, n
      x(i) = real(i, kind=8)
      y(i) = real(i*2, kind=8)
   enddo

   k = 1
!$omp simd linear(k)
   do i=1, n
      z1(i) = func(x, y, k)
      k = k + 1
   enddo

   do i=1, n
      z2 = real(i+i*2, kind=8)
      if (z1(i) /= z2) then
         print *, 'failed'
         stop
      endif
   enddo
   print *, 'passed'
end program main
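
The val case can be sketched the same way. Because the values of i are linear with step 1 and x and y are uniform, an emulated 4-lane variant needs only the first index i0 and can read x[i0] through x[i0+3] as one contiguous block. The names VAL_VLEN, func_scalar, func_val, and count_val_mismatches are hypothetical, and the lane loop is a stand-in for vector instructions rather than actual compiler output.

```cpp
#include <array>
#include <cassert>

constexpr int VAL_VLEN = 4;  // hypothetical lane count, mirroring simdlen(4)

// Scalar reference version, same computation as func in the example.
double func_scalar(const double* x, const double* y, int i)
{
    return x[i] + y[i];
}

// Emulated vector variant for uniform(x, y) linear(i: val, step(1)):
// only the first lane's index is passed; lane l uses index i0 + l.
std::array<double, VAL_VLEN> func_val(const double* x, const double* y, int i0)
{
    assert(x != nullptr && y != nullptr);
    std::array<double, VAL_VLEN> r{};
    for (int lane = 0; lane < VAL_VLEN; ++lane) {
        // Consecutive subscripts: these reads form one contiguous block.
        r[lane] = x[i0 + lane] + y[i0 + lane];
    }
    return r;
}

// Compares the emulated vector variant with the scalar version.
int count_val_mismatches()
{
    constexpr int n = 16;
    double x[n];
    double y[n];
    for (int k = 0; k < n; ++k) {
        x[k] = static_cast<double>(k);
        y[k] = static_cast<double>(k * 2);
    }

    int mismatches = 0;
    for (int i0 = 0; i0 < n; i0 += VAL_VLEN) {
        const std::array<double, VAL_VLEN> r = func_val(x, y, i0);
        for (int lane = 0; lane < VAL_VLEN; ++lane) {
            if (r[lane] != func_scalar(x, y, i0 + lane)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Without the linearity guarantee, the callee would need a separate index per lane and would have to treat the reads of x and y as a gather.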