int i;
int n = 1000000;
double area = 0;
double h = 1.0 / n;
#pragma omp parallel shared(n, h)
{
    double thread_area = 0; // Private / local variable
    #pragma omp for
    for (i = 1; i <= n; i++)
    {
        double x = h * (i - 0.5);
        thread_area += (4.0 / (1.0 + x*x));
    }
    #pragma omp atomic // Applies the reduction manually
    area += thread_area; // All threads aggregate into area
}
double pi = h * area;
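For experimentation, the fragment above can be wrapped into a self-contained function. This is only a minimal sketch: the function name estimate_pi and the choice to pass n as a parameter are assumptions made for illustration and are not part of the original listing. The loop counter is declared inside the for statement here, which keeps it private to each thread without relying on the implicit privatization of the loop variable.

```c
/* Hypothetical helper (name is illustrative): midpoint-rule estimate of pi
   using n intervals on [0, 1], with a manual per-thread reduction. */
double estimate_pi(int n)
{
    double area = 0.0;       /* shared accumulator, updated atomically */
    double h = 1.0 / n;      /* width of each interval */

    #pragma omp parallel shared(n, h)
    {
        double thread_area = 0.0;  /* private: declared inside the region */

        /* Iterations are divided among the threads of the team. */
        #pragma omp for
        for (int i = 1; i <= n; i++) {
            double x = h * (i - 0.5);           /* midpoint of interval i */
            thread_area += 4.0 / (1.0 + x * x);
        }

        /* Each thread adds its partial sum exactly once. */
        #pragma omp atomic
        area += thread_area;
    }

    return h * area;
}
```

Calling estimate_pi(1000000) from a main function and printing the result should give a value close to 3.14159. With GCC, for example, OpenMP is enabled with the -fopenmp flag; without it the pragmas are ignored and the same code runs serially.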
The threads are spawned by #pragma omp parallel. Each thread has its own private thread_area, which stores its partial sum. The loop that follows is distributed among the threads by #pragma omp for. Inside this loop, each thread accumulates into its own thread_area; after the loop, each thread adds its partial sum to the shared area, one update at a time, atomically through