diego domenzain
March 2021 @ Colorado School of Mines
Communicating with multiple processors (MP) using the openMP framework.
These functions are examples of how to use openMP.
They are taken from the book The OpenMP Common Core. The book itself comes with code in C and Fortran. The code can be found here and here.
This repo is just a way for me to learn this stuff. The code is to be compiled and run here in a simple way.
For C,
gcc -fopenmp file.c
./a.out
For Fortran,
gfortran -fopenmp file.f90
./a.out
My old Mac doesn't have openMP enabled for C, so the C implementation might not be complete.
Fortran is doing ok though.
🏡 windows parenthesis
Get ifort
set up:
cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'
compile with openmp
& run:
ifort /Qopenmp integral_parallel_block.f90
.\integral_parallel_block.exe
- Concurrency: if you don't schedule threads right, the result will be scrambled. Solutions,
  - critical sections: force a block of code to be executed by only one thread at a time.
  - barriers: explicitly force all threads to wait until every thread has reached the same point in the code.
- Plagues of parallel programming,
  - data racing:
    - two or more threads in a shared memory system issue loads and stores to overlapping address ranges,
    - those loads and stores are not constrained to follow a well-defined order.
  - false sharing: cache lines have to move back and forth between cores because the objects being accessed by different cores sit close together, on the same cache line.
- Throughput vs. latency,
- GPUs prioritize throughput rather than latency,
- good for algorithms whose "workers" need little data interaction.
- CPUs prioritize latency rather than throughput.
- Optimize code by
- reorganizing loops to reuse data from cache lines (cache blocking),
- initialize data on the same cores that will later process that data.
- MPI between nodes, and openMP within a node.
- openMP makes `pthread.h` simple.
- Two different architectures,
- Symmetric Multiprocessor (SMP).
- Non-uniform memory access (NUMA).
- Single program multiple data (SPMD) design pattern:
- Launch two or more threads that execute the same code.
- A team is a set of threads.
- Each thread has its own ID, and knows the total number of threads in the team.
- Use the ID and the number of threads in the team to split up the work between threads.
- Parallelization can be done cyclically or by blocks.