1.2. Creating a Parallel Program with OpenMP#

In this section, we introduce the syntax of OpenMP, show how to compile OpenMP programs, and give readers an overall picture of OpenMP programs through two simple examples.

1.2.1. How to compile OpenMP programs?#

When compiling OpenMP programs, you need to use a compiler flag to enable OpenMP, such as -openmp, -xopenmp, -fopenmp, or -mp, depending on the compiler.

In this book, all our examples are compiled using LLVM/Clang on Ubuntu 20.04. LLVM/Clang has the advantages of fast compilation, low memory usage, and a modular design. To build OpenMP code, you need to add -fopenmp when compiling. The full compile command is

clang -fopenmp filename.c -o filename

It is also worth mentioning that when an OpenMP program calls OpenMP runtime library routines (such as omp_get_thread_num), it needs to include the <omp.h> header file.

1.2.2. OpenMP Directives and Syntax#

A series of directives and clauses in OpenMP identify code blocks as parallel regions. Programmers only need to insert these directives into the code, so OpenMP is described as a directive-based language. In C/C++, directives are based on the #pragma omp construct; in Fortran, they begin with !$omp. To an ordinary compiler, a directive is just a regular language pragma (in C/C++) or a regular comment (in Fortran), so a special option is needed when the compiler is required to generate OpenMP code. Otherwise the compiler does not recognize the directives and simply ignores them.

The basic format of OpenMP directive in C/C++ is as follows:

#pragma omp directive-name [clause[ [,] clause]...]

In Fortran, the directives take one of the following forms:

Fixed forms:

*$OMP directive-name [clause[ [,] clause]...]
C$OMP directive-name [clause[ [,] clause]...]

Free form (but works for fixed form too):

!$omp directive-name [clause[ [,] clause]...]

Here ‘[]’ means optional. A directive acts on the statement immediately following it or on a block of statements enclosed by ‘{}’. Common directives are parallel, for, sections, single, atomic, barrier, simd, target, etc. A clause modifies a directive and can be used to specify additional information with it. The specific clause(s) that can be used depend on the directive.

In Fortran, OpenMP directives have a paired end directive, whose directive-name is formed as follows:

• if the directive-name starts with begin, the end directive replaces begin with end;

• otherwise, it is end directive-name, unless otherwise specified.

1.2.3. OpenMP Parallel Regions#

As we mentioned in the previous section, a parallel region is a block of code executed by all threads in a team simultaneously. A block is a logically connected group of program statements considered as a unit. Next, let’s take a look at how code blocks are defined in C/C++ and Fortran. In C/C++, a block is a single statement or a group of statements between opening and closing curly braces. For example:

#pragma omp parallel
{
    id = omp_get_thread_num();
    res[id] = lots_of_work(id);
}

The two statements between the curly braces are a logical unit where one statement cannot be executed without the other being executed. They form a code block.

Another example:

#pragma omp parallel for
for (i = 0; i < N; i++) {
    res[i] = big_calc(i);
    A[i] = B[i] + res[i];
}

As in the previous example, the two statements between the curly braces form a code block.

In Fortran, a block is a single statement or a group of statements between directive/end-directive pairs. For example:

!$OMP PARALLEL
10    wrk(id) = garbage(id)
      res(id) = wrk(id)**2
      if(.not.conv(res(id))) goto 10
!$OMP END PARALLEL

Another example:

!$OMP PARALLEL DO
    do i=1,N
        res(i)=bigComp(i)
    end do 
!$OMP END PARALLEL DO

In Fortran, OpenMP directives have a paired end directive. Therefore, in the two examples above, the statements between the OpenMP directive and the corresponding end directive form a code block, which is also a parallel region.

1.2.4. Creating a simple OpenMP Program#

This simple OpenMP program executes in parallel using multiple threads, and each thread outputs ‘Hello World’.

//%compiler: clang
//%cflags: -fopenmp

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char *argv[]){
    #pragma omp parallel
    printf("%s\n", "Hello World");
    
    return 0;
}
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World
Hello World

“#pragma omp parallel” indicates that the subsequent statement will be executed by multiple threads in parallel. The number of threads is preset by the system and is generally equal to the number of logical processors; for example, an Intel i5 CPU with 4 cores and 8 hardware threads has 8 logical processors.

If we want to specify the number of threads, we can add an optional clause to this directive. For example, “#pragma omp parallel num_threads(4)” still means that the subsequent statement will be executed by multiple threads in parallel, but the number of threads is 4.

//%compiler: clang
//%cflags: -fopenmp

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char *argv[]){
    #pragma omp parallel num_threads(4)
    printf("%s\n", "Hello World");
    
    return 0;
}
Hello World
Hello World
Hello World
Hello World

Through these two simple examples, readers should now understand the basic structure of an OpenMP program. Next, we will introduce the factors that affect the performance of OpenMP programs.