multi-threading with openmp

moore’s law is dead. at least that is what we keep hearing. personally, i think that while it might take time, talented engineers will figure out a way to keep moore’s law going. in the meantime we programmers need something to speed up our code. that something is parallel programming.

most likely the machine you are using to read this blog post has a multi-core processor in it. the idea is that each core can execute its own stream of instructions at the same time as the others. i.e., if you have a total of four cores, then in theory, a program that splits its work evenly across them could run up to four times faster.

one way a programmer can take advantage of a multi-core machine is OpenMP, or ‘open multi-processing.’ it is not a language of its own but an API of compiler directives and library functions for C, C++ and Fortran.

let’s start with a simple familiar example: `hello world.`

$ cat main.cpp

#include<iostream>
int main(int argc, char** argv)
{
std::cout << "Hello, World!" << std::endl;
}

$ gcc main.cpp -lstdc++

$ ./a.out
Hello, World!

The OpenMP API provides a number of compiler directives for creating threads. A directive is an action and the Greek word for action is ‘pragma.’ Thus to give an OpenMP directive to the compiler we write:

#pragma omp [directive content here]

The first directive we will demonstrate is `parallel`. This directive creates a team of threads, each of which executes the statement or block that follows it (the parallel region):

#include<iostream>
int main(int argc, char** argv)
{
	#pragma omp parallel
	std::cout << "Hello, World!" << std::endl;
}

Now in order to have the compiler actually process OpenMP directives, we need to pass the OpenMP flag to gcc: `-fopenmp`. Without it, gcc ignores the pragma and the program runs on a single thread.

$ gcc main.cpp -fopenmp -lstdc++
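As an aside, a program can check for itself whether `-fopenmp` took effect: a conforming compiler defines the `_OPENMP` macro (a yyyymm date identifying the supported specification) only when OpenMP is enabled. Here is a minimal sketch of that check. The helper names `openmp_version` and `report_openmp` are made up for this post and are not part of OpenMP:

```cpp
#include <iostream>

// Hypothetical helper (not part of OpenMP): returns the value of the
// _OPENMP macro, or 0 when the code was compiled without -fopenmp.
long openmp_version()
{
#ifdef _OPENMP
    return _OPENMP;   // yyyymm date of the supported OpenMP specification
#else
    return 0;         // the compiler did not enable OpenMP
#endif
}

// Hypothetical helper: prints whether OpenMP directives were honored.
void report_openmp()
{
    if (openmp_version() > 0)
        std::cout << "OpenMP enabled, _OPENMP = " << openmp_version() << std::endl;
    else
        std::cout << "OpenMP disabled: #pragma omp lines are ignored" << std::endl;
}
```

Calling `report_openmp()` from `main` should print the first message when the file is built with `-fopenmp` and the second one otherwise.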

And now when we run our binary we see something like:

$ ./a.out
Hello, World!
Hello, World!
Hello, World!

And now we have written our first OpenMP program!
Although if you run the program a few more times you will see:

$ ./a.out
Hello, World!Hello, World!

$ ./a.out
Hello, World!Hello, World!

$ ./a.out
Hello, World!
Hello, World!
$ ./a.out
Hello, World!Hello, World!

Notice how the output of the threads sometimes gets jumbled together! This is because each thread executes its print statement at its own pace, and since nothing controls when each thread writes to `std::cout`, one thread's text can land in the middle of another's.

To correct this we need another directive, the ‘critical section.’ This directive tells the compiler that the block which follows may only be executed by one thread at a time.

#include<iostream>
int main(int argc, char** argv)
{
  #pragma omp parallel
  {
    // All code in this block is in the parallel region
    #pragma omp critical
    {
      // only one thread is able to execute code
      // in a 'critical' section at a time.
      std::cout << "Hello, World!" << std::endl;
    }
  }
}

And now when we compile and run, each greeting always lands on its own line:

$ ./a.out
Hello, World!
Hello, World!
$ ./a.out
Hello, World!
Hello, World!
$ ./a.out
Hello, World!
Hello, World!
$ ./a.out
Hello, World!
Hello, World!
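A critical section protects shared data in the same way it protects shared output. As a rough sketch (the function `count_threads` is invented for this example), every thread adds one to a shared counter inside a critical section, so no increment is lost:

```cpp
// Hypothetical helper: every thread in the parallel region adds 1 to a
// shared counter. The critical section lets only one thread at a time
// perform the read-modify-write on the counter.
int count_threads()
{
    int counter = 0;   // declared outside the region, so shared by all threads
    #pragma omp parallel
    {
        #pragma omp critical
        {
            counter = counter + 1;
        }
    }
    return counter;    // number of threads that executed the region
}
```

Calling `count_threads()` from `main` and printing the result should report the size of the thread team. Compiled without `-fopenmp` the pragmas are ignored and it returns 1.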

In addition to compiler directives, the OpenMP API also provides a number of runtime library functions, declared in `omp.h`. `omp_get_thread_num()`, for example, gives us an integer 0,..,N-1 identifying which of the N threads in the team is executing the call.
Consider:

#include<iostream>
#include <omp.h>
int main(int argc, char** argv)
{
  #pragma omp parallel
  {
    // All code in this block is in the parallel region
    #pragma omp critical
    {
      // only one thread is able to execute code
      // in a 'critical' section at a time.
      std::cout << "Hello, World: From Thread "
                << omp_get_thread_num()
                << std::endl;
    }
  }
}

which prints something like the following (the order of the threads may vary between runs):

$ ./a.out
Hello, World: From Thread 0
Hello, World: From Thread 1
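To see the 0,..,N-1 numbering directly, the ids can be collected instead of printed. The sketch below assumes a hypothetical helper, `collect_thread_ids`, which is not part of OpenMP. The `_OPENMP` guards are only there so the code still builds when `-fopenmp` is left off:

```cpp
#include <algorithm>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: returns the sorted ids of all threads that ran
// the parallel region.
std::vector<int> collect_thread_ids()
{
    std::vector<int> ids;   // shared by all threads in the region below
    #pragma omp parallel
    {
#ifdef _OPENMP
        int id = omp_get_thread_num();   // this thread's id within the team
#else
        int id = 0;                      // no OpenMP: only the main thread runs
#endif
        #pragma omp critical
        {
            // push_back on a shared vector is not thread-safe, so guard it
            ids.push_back(id);
        }
    }
    std::sort(ids.begin(), ids.end());
    return ids;
}
```

Each thread contributes its id exactly once, so after sorting the vector should read 0, 1, ..., N-1 for a team of N threads.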
