R Language Parallel processing Parallel processing with foreach package


The foreach package brings the power of parallel processing to R. But before you want to use multi core CPUs you have to assign a multi core cluster. The doSNOW package is one possibility.

A simple use of the foreach loop is to calculate the sum of the square root and the square of all numbers from 1 to 100000.


cl <- makeCluster(5, type = "SOCK")

f <- foreach(i = 1:100000, .combine = c, .inorder = F) %dopar% {
    k <- i ** 2 + sqrt(i)

The structure of the output of foreach is controlled by the .combine argument. The default output structure is a list. In the code above, c is used to return a vector instead. Note that a calculation function (or operator) such as "+" may also be used to perform a calculation and return a further processed object.

It is important to mention that the result of each foreach-loop is the last call. Thus, in this example k will be added to the result.

.combinecombine Function. Determines how the results of the loop are combined. Possible values are c, cbind, rbind, "+", "*"...
.inorderif TRUE the result is ordered according to the order of the iteration vairable (here i). If FALSE the result is not ordered. This can have postive effects on computation time.
.packagesfor functions which are provided by any package except base, like e.g. mass, randomForest or else, you have to provide these packages with c("mass", "randomForest")