6.4. Structure Mapping
In the example below, only the structure elements S.a, S.b, and S.p of the S structure appear in map clauses of a target construct. Only these components have corresponding variables and storage on the device. Hence, the large arrays S.buffera and S.bufferb and the S.x component have no storage on the device and cannot be accessed there.
Also, since the pointer member S.p is used in an array section of a map clause, the device storage of the array section, S.p[:N], is attached to the pointer member S.p on the device. Explicitly mapping the pointer member S.p is optional in this case.
Note: The buffer arrays and the x variable have been grouped together, so that the components that will reside on the device are all together (without gaps). This allows the runtime to optimize the transfer and the storage footprint on the device.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: target_struct_map.1
* type: C
* version: omp_5.1
*/
#include <stdio.h>
#include <stdlib.h>
#define N 100
#define BAZILLION 2000000
struct foo {
    char buffera[BAZILLION];
    char bufferb[BAZILLION];
    float x;
    float a, b;
    float *p;
};

#pragma omp begin declare target
void saxpyfun(struct foo *S)
{
    int i;
    for(i=0; i<N; i++)
        S->p[i] = S->p[i]*S->a + S->b;
}
#pragma omp end declare target

int main()
{
    struct foo S;
    int i;

    S.a = 2.0;
    S.b = 4.0;
    S.p = (float *)malloc(sizeof(float)*N);
    for(i=0; i<N; i++) S.p[i] = i;

    #pragma omp target map(alloc:S.p) map(S.p[:N]) map(to:S.a, S.b)
    saxpyfun(&S);

    printf(" %4.0f %4.0f\n", S.p[0], S.p[N-1]);
    // 4 202 <- output

    free(S.p);
    return 0;
}
The following example is a slight modification of the above example for a C++ class. In the member function SAXPY::driver, the array section p[:N] is attached to the pointer member p on the device.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: target_struct_map.2
* type: C++
* version: omp_5.1
*/
#include <cstdio>
#include <cstdlib>
#define N 100
class SAXPY {
private:
    float a, b, *p;
public:
    float buffer[N];
    SAXPY(float arg_a, float arg_b){ a=arg_a; b=arg_b; }
    void driver();
    void saxpyfun(float *p);
};

#pragma omp begin declare target
void SAXPY::saxpyfun(float *q)
{
    for(int i=0; i<N; i++)
        buffer[i] = q[i]*a + b;
}
#pragma omp end declare target

void SAXPY::driver()
{
    p = (float *) malloc(N*sizeof(float));
    for(int i=0; i<N; i++) p[i]=i;

    #pragma omp target map(alloc:p) map(to:p[:N]) map(to:a,b) \
                       map(from:buffer[:N]) // attach(p) to device_malloc()
    {
        saxpyfun(p);
    }

    free(p);
}

int main()
{
    SAXPY my_saxpy(2.0,4.0);

    my_saxpy.driver();

    printf(" %4.0f %4.0f\n", my_saxpy.buffer[0], my_saxpy.buffer[N-1]);
    // 4 202 <- output

    return 0;
}
The next example shows two ways in which the structure may be incorrectly mapped.
In Case 1, the array section S1.p[:N] is first mapped in an enclosing target data construct, and the target construct then implicitly maps the structure S1. The initial map of the array section does not map the base pointer S1.p; it only maps the elements of the array section. Furthermore, the implicit map is not sufficient to ensure pointer attachment for the structure member S1.p (refer to the conditions for pointer attachment described in Section 6.3). Consequently, the dereference operation S1.p[i] in the call to saxpyfun will probably fail because S1.p contains a host address.
In Case 2, an array section is again mapped on an enclosing target data construct. This time, the nested target construct explicitly maps S2.p, S2.a, and S2.b. But as in Case 1, this does not satisfy the conditions for pointer attachment, since the construct must map a list item for which S2.p is a base pointer, and it must do so when S2.p is already present on the device or will be created on the device as a result of the same construct.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: target_struct_map.3
* type: C
* version: omp_5.1
*/
#include <stdio.h>
#include <stdlib.h>
#define N 100
#define BAZILLION 2000000
struct foo {
    char buffera[BAZILLION];
    char bufferb[BAZILLION];
    float x;
    float a, b;
    float *p;
};

#pragma omp begin declare target
void saxpyfun(struct foo *S)
{
    int i;
    for(i=0; i<N; i++)
        S->p[i] = S->p[i] * S->a + S->b; // S->p[i] invalid
}
#pragma omp end declare target

int main()
{
    struct foo S1, S2;
    int i;

    // Case 1
    S1.a = 2.0;
    S1.b = 4.0;
    S1.p = (float *)malloc(sizeof(float)*N);
    for(i=0; i<N; i++) S1.p[i] = i;

    // No pointer attachment for S1.p here
    #pragma omp target data map(S1.p[:N])
    #pragma omp target // implicit map of S1
    saxpyfun(&S1);

    // Case 2
    S2.a = 2.0;
    S2.b = 4.0;
    S2.p = (float *)malloc(sizeof(float)*N);
    for(i=0; i<N; i++) S2.p[i] = i;

    // No pointer attachment for S2.p here either
    #pragma omp target data map(S2.p[:N])
    #pragma omp target map(S2.p, S2.a, S2.b) // implicit map of S2
    saxpyfun(&S2);

    // These print statements may not execute because the
    // above code is invalid
    printf(" %4.0f %4.0f\n", S1.p[0], S1.p[N-1]);
    printf(" %4.0f %4.0f\n", S2.p[0], S2.p[N-1]);

    free(S1.p);
    free(S2.p);
    return 0;
}
The following example correctly implements pointer attachment cases that involve implicit structure maps.
In Case 1, members p, a, and b of the structure S1 are explicitly mapped by the target data construct, to avoid mapping parts of S1 that are not required on the device. The mapped S1.p is attached to the array section S1.p[:N] and remains attached while it exists on the device (for the duration of the target data region). Because S1 is referenced inside the nested target construct, the construct implicitly maps S1, so that the reference refers to the corresponding storage created by the enclosing target data region. Note that only the members a, b, and p may be accessed from this storage.
In Case 2, only the storage for the array section S2.p[:N] is mapped by the target data construct. The nested target construct explicitly maps S2.a and S2.b and explicitly maps an array section for which S2.p is a base pointer. This satisfies the conditions for S2.p becoming an attached pointer. The array section in this case is zero-length, but the effect would be the same if the length were a positive integer less than or equal to N. There is also an implicit map of the containing structure S2, again due to the reference to S2 inside the construct. This implicit map permits access only to members a, b, and p, as in Case 1.
In Case 3, there is no target data construct. The target construct explicitly maps S3.a and S3.b and explicitly maps an array section for which S3.p is a base pointer. Again, there is an implicit map of the structure referenced in the construct, S3. This implicit map also causes S3.p to be implicitly mapped, because no other part of S3 is present prior to the construct being encountered. The result is an attached pointer S3.p on the device. As in Cases 1 and 2, this implicit map only ensures that storage for the members a, b, and p is accessible within the corresponding S3 that is created on the device.
//%compiler: clang
//%cflags: -fopenmp
/*
* name: target_struct_map.4
* type: C
* version: omp_5.1
*/
#include <stdio.h>
#include <stdlib.h>
#define N 100
#define BAZILLION 2000000
struct foo {
    char buffera[BAZILLION];
    char bufferb[BAZILLION];
    float x;
    float a, b;
    float *p;
};

#pragma omp begin declare target
void saxpyfun(struct foo *S)
{
    int i;
    for(i=0; i<N; i++)
        S->p[i] = S->p[i]*S->a + S->b;
}
#pragma omp end declare target

int main()
{
    struct foo S1, S2, S3;
    int i;

    // Case 1
    S1.a = 2.0;
    S1.b = 4.0;
    S1.p = (float *)malloc(sizeof(float)*N);
    for(i=0; i<N; i++) S1.p[i] = i;

    // The target data construct results in pointer attachment for S1.p.
    // Explicitly mapping S1.p, S1.a, and S1.b rather than S1 avoids
    // mapping the entire structure (including members buffera, bufferb,
    // and x).
    #pragma omp target data map(S1.p[:N],S1.p,S1.a,S1.b)
    #pragma omp target //implicit map of S1
    saxpyfun(&S1);

    // Case 2
    S2.a = 2.0;
    S2.b = 4.0;
    S2.p = (float *)malloc(sizeof(float)*N);
    for(i=0; i<N; i++) S2.p[i] = i;

    // The target construct results in pointer attachment for S2.p.
    #pragma omp target data map(S2.p[:N])
    #pragma omp target map(S2.p[:0], S2.a, S2.b) // implicit map of S2
    saxpyfun(&S2);

    // Case 3
    S3.a = 2.0;
    S3.b = 4.0;
    S3.p = (float *)malloc(sizeof(float)*N);
    for(i=0; i<N; i++) S3.p[i] = i;

    // The target construct results in pointer attachment for S3.p.
    // Note that S3.p is implicitly mapped due to the implicit map of S3
    // (but corresponding storage is NOT created for members buffera,
    // bufferb, and x).
    #pragma omp target map(S3.p[:N], S3.a, S3.b) // implicit map of S3
    saxpyfun(&S3);

    printf(" %4.0f %4.0f\n", S1.p[0], S1.p[N-1]); //OUT1 4 202
    printf(" %4.0f %4.0f\n", S2.p[0], S2.p[N-1]); //OUT2 4 202
    printf(" %4.0f %4.0f\n", S3.p[0], S3.p[N-1]); //OUT3 4 202

    free(S1.p);
    free(S2.p);
    free(S3.p);
    return 0;
}