Loop parallelism in OpenMP


  • private: Comma-separated list of private variables
  • firstprivate: Like private, but initialized to the value of the variable before entering the loop
  • lastprivate: Like private, but upon exit the variable gets the value corresponding to the last iteration of the loop
  • reduction: Reduction operator, then a colon, then a comma-separated list of the corresponding reduction variables
  • schedule: static, dynamic, guided, auto or runtime, with an optional chunk size after a comma for the first three
  • collapse: Number of perfectly nested loops to collapse and parallelize together
  • ordered: Indicates that some parts of the loop must be executed in iteration order (these parts are identified by ordered directives inside the loop body)
  • nowait: Remove the implicit barrier that exists by default at the end of the loop construct
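As an illustration of the data-sharing and reduction clauses listed above, here is a minimal sketch in C. The function name sum_with_offset and its parameters are hypothetical, chosen only for this example, and the caller is assumed to pass n > 0 so that a last iteration exists.

```c
/* Sketch only: sum_with_offset and its parameters are hypothetical names.
 * Computes the sum of (i*i + offset) for i in [0, n) and reports the term
 * produced by the last iteration. Assumes n > 0. */
long sum_with_offset(int n, long offset, long *last_term)
{
    long sum = 0;   /* reduction variable: per-thread partial sums are combined at the end */
    long sq = 0;    /* scratch value: each thread gets its own uninitialized copy */
    long term = 0;  /* lastprivate: keeps the value from iteration i == n - 1 */
    int i;

    #pragma omp parallel for private(sq) firstprivate(offset) \
                             lastprivate(term) reduction(+ : sum)
    for (i = 0; i < n; i++) {
        sq = (long)i * i;
        term = sq + offset;  /* offset was copied in from the enclosing scope */
        sum += term;
    }

    *last_term = term;
    return sum;
}
```

If the code is compiled without OpenMP support, the pragma is ignored and the loop simply runs serially with the same result.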


The meaning of the schedule clause is as follows:

  • static[,chunk]: Distribute the loop iterations statically (meaning that the distribution is fixed before entering the loop) in batches of chunk iterations, assigned to the threads in a round-robin fashion. If chunk isn't specified, the batches are as even as possible and each thread gets at most one of them.
  • dynamic[,chunk]: Distribute the loop iterations among the threads in batches of chunk iterations with a first-come-first-served policy, until no batch remains. If not specified, chunk is set to 1.
  • guided[,chunk]: Like dynamic, but with batches whose sizes get smaller and smaller, down to chunk (1 if not specified).
  • auto: Let the compiler and/or the run time library decide what is best suited.
  • runtime: Defer the decision to run time by means of the OMP_SCHEDULE environment variable. If the environment variable is not defined at run time, the default scheduling is used.
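The schedule kinds described above can be sketched as follows. The helper names (work, static_owner, total_static, total_dynamic, total_runtime) are hypothetical and exist only for this illustration. The static_owner function is not an OpenMP call; it just spells out the round-robin rule of schedule(static, chunk) for a loop starting at 0 with a step of 1.

```c
/* Hypothetical uneven workload: iteration i costs O(i) and returns i*(i+1)/2. */
static long work(int i)
{
    long s = 0;
    for (int k = 0; k <= i; k++)
        s += k;
    return s;
}

/* Which thread runs iteration i under schedule(static, chunk):
 * batches of `chunk` iterations are dealt out round-robin by thread number. */
int static_owner(int i, int chunk, int nthreads)
{
    return (i / chunk) % nthreads;
}

/* Fixed round-robin batches of 2 iterations. */
long total_static(int n)
{
    long total = 0;
    #pragma omp parallel for schedule(static, 2) reduction(+ : total)
    for (int i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Batches of 4 iterations handed out first-come-first-served,
 * which tends to balance uneven iteration costs. */
long total_dynamic(int n)
{
    long total = 0;
    #pragma omp parallel for schedule(dynamic, 4) reduction(+ : total)
    for (int i = 0; i < n; i++)
        total += work(i);
    return total;
}

/* Schedule chosen at run time from the OMP_SCHEDULE environment variable. */
long total_runtime(int n)
{
    long total = 0;
    #pragma omp parallel for schedule(runtime) reduction(+ : total)
    for (int i = 0; i < n; i++)
        total += work(i);
    return total;
}
```

With the runtime variant, setting for instance OMP_SCHEDULE="guided,4" in the environment before launching the program selects a guided schedule with a minimum batch size of 4. The result is the same whatever the schedule; only the distribution of iterations among threads changes.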

The default for schedule is implementation defined. In many environments it is static, but it can also be dynamic or could very well be auto. Therefore, make sure your code doesn't implicitly rely on a particular schedule without setting it explicitly.

In the above examples, we used the fused form parallel for or parallel do. However, the loop construct can be used without fusing it with the parallel directive, in the form of a #pragma omp for [...] or !$omp do [...] standalone directive within a parallel region.
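A minimal sketch of the standalone form in C, assuming a hypothetical function scale_and_shift that works on two independent arrays. Since the second loop never touches a, the barrier at the end of the first loop can be dropped with nowait.

```c
/* Sketch only: scale_and_shift is a hypothetical name.
 * Doubles every element of a and adds 1 to every element of b. */
void scale_and_shift(int n, double *a, double *b)
{
    #pragma omp parallel
    {
        /* First worksharing loop: nowait removes the implicit barrier,
         * so a thread that finishes early moves on to the next loop. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            a[i] *= 2.0;

        /* Second worksharing loop: independent of a, hence safe to start
         * early. Its implicit barrier is kept. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            b[i] += 1.0;
    }
}
```

Splitting parallel and for this way lets several loops share a single thread team, which avoids starting a new parallel region for each loop.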

For the Fortran version, the loop index variable(s) of the parallelized loop(s) is (are) always private by default, and so are the indexes of any sequential loops nested inside. There is therefore no need to declare them private explicitly (although doing so isn't an error).
For the C and C++ version, only the index of the loop(s) associated with the directive (the parallelized loop, plus the inner loops covered by a collapse clause) is made private automatically. The indexes of any other nested loops are just like any other variables. Therefore, if their scope extends outside of the parallel region (meaning they are not declared like for ( int j = ...) but rather like int j; ... for ( j = ... )), then they have to be declared private.
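The following C sketch illustrates the point with a nested loop whose indexes are declared outside the loops. The function names sum_matrix and sum_matrix_collapsed are hypothetical. The second variant declares the indexes inside the for statements and uses collapse(2), so no private clause is needed.

```c
/* Sketch only: sum_matrix and sum_matrix_collapsed are hypothetical names.
 * m is a rows x cols matrix stored row by row. */
long sum_matrix(int rows, int cols, const int *m)
{
    long total = 0;
    int i, j;  /* both indexes are declared outside the loops */

    /* The indexes are listed explicitly as private; without it, the inner
     * index j would be shared by all threads and the loop would race. */
    #pragma omp parallel for private(i, j) reduction(+ : total)
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            total += m[i * cols + j];

    return total;
}

long sum_matrix_collapsed(int rows, int cols, const int *m)
{
    long total = 0;

    /* Indexes declared in the for statements are local to the loops, and
     * collapse(2) parallelizes the two perfectly nested loops together. */
    #pragma omp parallel for collapse(2) reduction(+ : total)
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            total += m[i * cols + j];

    return total;
}
```

Declaring loop indexes inside the for statement, as in the second function, is usually the simplest way to avoid this class of bug in C99 and C++.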