In this assignment, you will parallelize three programs using OpenMP: the Jacobi iterative method, image Histogram, and image Filtering. For Jacobi, the sequential version and the program skeleton are given in the jacobi.c file. For image Histogram and Filtering (convolution), you can start with the sequential implementation you finished in Assignment 1, or use the files provided in this assignment. For Filtering, your implementation should provide OpenMP parallelization of the processing of a single image. You have the option to enhance your filtering implementation to process multiple images in a pipeline (see the description below), which would earn you bonus points. Please check the course lecture Dense Matrices and Decomposition for the description of the parallelization algorithms.
OpenMP omp parallel should be applied to the outer while loop and omp for to the two inner for loops of the jacobi_omp function. reduction and single may be needed to make sure that the output and the error are computed correctly.
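The structure described above can be sketched as follows. This is only an illustration under stated assumptions: the function name jacobi_sketch, its parameters, the 5-point averaging stencil, and the use of the sum of squared updates as the error are hypothetical and do not match the signature or the update formula in jacobi.c. What it shows is where omp parallel, omp for, reduction, and single could go, and why the barriers keep the shared loop condition consistent across threads.

```c
#include <assert.h>

/* Hypothetical, simplified Jacobi solver on an n x m row-major grid.
 * Returns the number of sweeps performed; *final_error receives the
 * sum of squared updates from the last sweep (an assumed error measure). */
int jacobi_sketch(int n, int m, double *u, double *uold,
                  double tol, int max_iters, double *final_error)
{
    assert(n > 2 && m > 2);
    int k = 0;                  /* sweep counter, shared by all threads */
    double error = 10.0 * tol;  /* shared; forces at least one sweep */

    #pragma omp parallel
    {
        /* Every thread runs the while loop; the work inside is shared out. */
        while (k < max_iters && error > tol) {
            /* First inner loop: copy the current solution into uold. */
            #pragma omp for
            for (int i = 0; i < n; i++)
                for (int j = 0; j < m; j++)
                    uold[i * m + j] = u[i * m + j];
            /* Implicit barrier: all threads have tested the loop condition. */

            /* One thread resets the shared error and advances the counter. */
            #pragma omp single
            {
                error = 0.0;
                k++;
            }
            /* Implicit barrier at the end of single. */

            /* Second inner loop: stencil update plus error accumulation. */
            #pragma omp for reduction(+:error)
            for (int i = 1; i < n - 1; i++) {
                for (int j = 1; j < m - 1; j++) {
                    double unew = 0.25 * (uold[(i - 1) * m + j] +
                                          uold[(i + 1) * m + j] +
                                          uold[i * m + j - 1] +
                                          uold[i * m + j + 1]);
                    double diff = unew - uold[i * m + j];
                    u[i * m + j] = unew;
                    error += diff * diff;
                }
            }
            /* Implicit barrier: the reduced error is visible to all threads. */
        }
    }
    *final_error = error;
    return k;
}
```

Without the single, several threads would reset error and increment k; without the reduction, concurrent updates to error would race.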
OpenMP parallel for should be used to parallelize the loop that computes the histogram. The difficult part is storing the histogram, since a data race may be introduced when multiple threads update the histogram at the same time. We talked about two options for parallelization in class, and you only need to implement one of them.
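One common way to avoid the data race is to give each thread a private histogram and merge the private copies at the end. The sketch below illustrates that approach; it is not necessarily one of the two options covered in class. The function name histogram_omp, the 8-bit grayscale input, and the 256-bin layout are assumptions made for illustration and are not taken from Histogram.cpp.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BINS 256  /* assumed: one bin per 8-bit intensity value */

/* Hypothetical helper: counts how often each intensity occurs in pixels. */
void histogram_omp(const unsigned char *pixels, size_t n,
                   unsigned long hist[NUM_BINS])
{
    assert(pixels != NULL || n == 0);
    for (int b = 0; b < NUM_BINS; b++)
        hist[b] = 0;

    #pragma omp parallel
    {
        /* Private histogram: each thread updates its own copy, so no race. */
        unsigned long local[NUM_BINS] = {0};

        /* nowait: a thread may start merging as soon as its share is done. */
        #pragma omp for nowait
        for (long i = 0; i < (long)n; i++)
            local[pixels[i]]++;

        /* Merge into the shared histogram, one thread at a time. */
        #pragma omp critical
        for (int b = 0; b < NUM_BINS; b++)
            hist[b] += local[b];
    }
}
```

The alternative of updating the shared histogram directly with omp atomic needs no private copies, but threads contend whenever they hit the same bin.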
Your implementation should provide OpenMP parallelization of the processing of a single image by using OpenMP parallel for on the filtering loop.
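A minimal sketch of a parallel filtering loop is shown below, assuming a single-channel float image in row-major order, a square kernel of odd size, and clamp-to-edge handling at the borders. The names filter_omp and clamp_index are hypothetical, and the data layout in Filtering.cpp may differ. Because each output pixel is written by exactly one iteration and the input is only read, the row loop can be parallelized without synchronization.

```c
#include <assert.h>

/* Hypothetical helper: clamp v into the range [lo, hi]. */
static int clamp_index(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Applies a ksize x ksize kernel to in and stores the result in out.
 * The kernel is applied without flipping, which equals convolution
 * for symmetric kernels. in and out must not overlap. */
void filter_omp(const float *in, float *out, int rows, int cols,
                const float *kernel, int ksize)
{
    assert(ksize > 0 && ksize % 2 == 1);
    int r = ksize / 2;

    /* Rows are independent, so they are divided among the threads. */
    #pragma omp parallel for
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            float sum = 0.0f;
            for (int ky = -r; ky <= r; ky++) {
                for (int kx = -r; kx <= r; kx++) {
                    /* Border pixels reuse the nearest valid pixel. */
                    int yy = clamp_index(y + ky, 0, rows - 1);
                    int xx = clamp_index(x + kx, 0, cols - 1);
                    sum += in[yy * cols + xx] *
                           kernel[(ky + r) * ksize + (kx + r)];
                }
            }
            out[y * cols + x] = sum;
        }
    }
}
```

Writing to a separate output buffer matters here: filtering in place would let one thread read pixels that another thread has already overwritten.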
You have the option to enhance your program to process multiple images using the pipeline algorithm, which overlaps I/O and computation as we discussed in class (slide #86 of OpenMP). The algorithm is also shown in the following picture. Please refer to OpenCV for reading and writing images. Your program should also be enhanced to take as arguments the name of a folder that contains the images and the number of images to be processed. To simplify reading and writing images, all input images are named sequentially as 1.jpg, 2.jpg, 3.jpg, …, and output images should be named 1out.jpg, 2out.jpg, 3out.jpg, …. Image files are provided in the data folder (about 50 images). You will earn bonus points by doing this.
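One possible way to overlap the read, filter, and write stages is with OpenMP tasks and depend clauses, sketched below. This is an assumption about how such a pipeline could be structured and may differ from the algorithm on the slide. The stage functions read_stage, filter_stage, and write_stage are hypothetical stand-ins that operate on integers; in the assignment they would read N.jpg with OpenCV, filter the image, and write Nout.jpg.

```c
#include <assert.h>

#define MAX_IMAGES 64  /* assumed upper bound for this sketch */

/* Hypothetical stand-ins for the real I/O and compute stages. */
static int read_stage(int image_number) { return image_number; }
static int filter_stage(int raw) { return 2 * raw; }
static void write_stage(int slot, int result, int *sink) { sink[slot] = result; }

/* Processes images 1..num_images; sink[i] receives the result for image i+1. */
void pipeline_sketch(int num_images, int *sink)
{
    assert(num_images >= 0 && num_images <= MAX_IMAGES);
    int raw[MAX_IMAGES];   /* output of the read stage */
    int done[MAX_IMAGES];  /* output of the filter stage */

    #pragma omp parallel
    #pragma omp single
    {
        /* One thread creates the tasks; the whole team executes them. */
        for (int i = 0; i < num_images; i++) {
            #pragma omp task depend(out: raw[i]) firstprivate(i)
            raw[i] = read_stage(i + 1);

            /* Runs only after image i has been read. */
            #pragma omp task depend(in: raw[i]) depend(out: done[i]) firstprivate(i)
            done[i] = filter_stage(raw[i]);

            /* Runs only after image i has been filtered. */
            #pragma omp task depend(in: done[i]) firstprivate(i)
            write_stage(i, done[i], sink);
        }
    }
    /* All tasks have finished at the barrier that ends the parallel region. */
}
```

The dependences order only the three stages of the same image, so reading image i+1 can proceed while image i is being filtered. Note that this sketch does not force a stage to handle images in order.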
You can start your implementation with your code from Assignment 1. For Jacobi, copy the jacobi.c file to the folder and add two lines, add_executable( jacobi jacobi.c ) and target_link_libraries( jacobi m ), to the CMakeLists.txt file so that you can build jacobi using the same instructions as in the README.md file. A sample CMakeLists.txt is given for your reference if you need it. For image Histogram and Filtering (convolution), you can also start with the implementation in Histogram.cpp and Filtering.cpp if you do not want to use yours.
Your development can be done on any machine that has the necessary tools and environment set up (mainly make, cmake, gcc compiler, CUDA for GPU, and OpenCV), including those in Swearingen 1D39 and 3D22 for OpenMP and MPI development. In this assignment, you are encouraged to report your results using the Bridges supercomputer from PSC, which will earn you 10 points as a bonus. In the future, reporting results using Bridges will be required. Please check the Access to XSEDE PSC Bridges Supercomputer section for details. Each node of Bridges has 16 cores supporting up to 32 hardware threads. You can use export OMP_NUM_THREADS=8 to set the number of threads for your OpenMP program execution.
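To check that the setting took effect, a program can count the threads that actually join a parallel region. The helper below, count_team_threads, is a hypothetical name used only for illustration. When the code is compiled without OpenMP support, the pragma is ignored and the function returns 1.

```c
#include <assert.h>

/* Hypothetical helper: returns how many threads execute a parallel region. */
int count_team_threads(void)
{
    int count = 0;

    /* Each thread adds 1 to its private copy; the copies are then summed. */
    #pragma omp parallel reduction(+:count)
    count += 1;

    assert(count >= 1);
    return count;
}
```

With OMP_NUM_THREADS=8 exported and OpenMP enabled at compile time, this would typically return 8, although the runtime is allowed to provide fewer threads.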
Your submission should be a single zipped file named LastNameFirstName.zip that includes ONLY the following: three source code files (jacobi.c, Histogram.cpp, Filtering.cpp), a CMakeLists.txt file that describes how to build the sources into executables, and one PDF file for your report. Please remove all other files, including the executables, Excel sheet, etc. The source files contain your implementations, and each file should be individually compilable to generate an executable. The report should be at most 3 pages and include:
x threads, will be automatically calculated based on the numbers in the execution time table. The two figures will be automatically populated and generated by Excel as well. Please include those figures in your report.
cat /proc/cpuinfo command.