cuda Tutorial => Parallel reduction (e.g. how to sum an array)

Remarks

A parallel reduction algorithm is one that combines the elements of an array to produce a single result. Typical problems in this category are:

summing up all elements in an array
finding a maximum in an array

In general, parallel reduction can be applied with any binary associative operator, i.e. one for which (A*B)*C = A*(B*C). With such an operator *, the algorithm repeatedly groups the array elements in pairs. Each pair is computed in parallel with the others, halving the array size in each step. The process is repeated until only a single element remains.

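If the operator is also commutative (i.e. A*B = B*A), the algorithm can pair the elements in a different pattern, for example combining element i with the element half an array away. From a theoretical standpoint it makes no difference, but in practice it gives a better memory access pattern, because neighbouring threads then read neighbouring memory locations.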

Not all associative operators are commutative - take matrix multiplication for example.
