@parallel can be used to parallelize a loop, dividing its iterations among different workers. As a very simple example:
addprocs(3)
a = collect(1:10)
@parallel for idx = 1:10
    println(a[idx])
end
For a slightly more complex example, consider:
@time begin
    @sync begin
        @parallel for idx in 1:length(a)
            sleep(a[idx])
        end
    end
end
27.023411 seconds (13.48 k allocations: 762.532 KB)
julia> sum(a)
55
Thus, we see that if we had executed this loop without @parallel, it would have taken 55 seconds, rather than 27, to execute.
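The 27-second figure can be reproduced by hand. The sketch below assumes that each of the three workers receives one contiguous block of iterations, with any remainder going to the earliest blocks. This assumption is consistent with the timing above, but the actual split is an implementation detail. contiguous_chunks is a hypothetical helper written for illustration, not part of Julia.

```julia
# Assumption: 3 workers, each given one contiguous block of the range 1:10.
a = collect(1:10)
assumed_workers = 3

# Hypothetical helper (not a Julia API): split 1:n into k contiguous,
# near-equal chunks, giving the leftover iterations to the first chunks.
function contiguous_chunks(n, k)
    base, extra = divrem(n, k)
    chunks = UnitRange{Int}[]
    first_idx = 1
    for i in 1:k
        len = base + (i <= extra ? 1 : 0)
        push!(chunks, first_idx:(first_idx + len - 1))
        first_idx += len
    end
    return chunks
end

chunks = contiguous_chunks(length(a), assumed_workers)  # 1:4, 5:7, 8:10

# Total seconds each worker spends sleeping: 10, 18 and 27.
chunk_times = [sum(a[c]) for c in chunks]

# The loop finishes when the slowest worker finishes.
println(maximum(chunk_times))   # 27
```

Under this assumption the last worker sleeps for 8 + 9 + 10 = 27 seconds, which matches the measured wall time.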
We can also supply a reduction operator to the @parallel macro. Suppose we have an array and want to sum each of its columns, then multiply these sums together:
A = rand(100,100);
@parallel (*) for idx = 1:size(A,2)
    sum(A[:,idx])
end
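To check what the reduction computes, it can help to write the serial equivalent on a matrix small enough to verify by hand. This is only a sketch of the arithmetic, not of how the macro distributes work; the 2x2 matrix and the names col_sums and result are chosen here for illustration.

```julia
# Small matrix chosen so the result can be checked by hand.
A = [1 2;
     3 4]

# One value per column, as in the loop body: 1 + 3 = 4 and 2 + 4 = 6.
col_sums = [sum(A[:, j]) for j in 1:size(A, 2)]

# Combine the per-column values with the same operator given to the macro.
result = reduce(*, col_sums)
println(result)   # 24
```

The parallel loop should return the same value as this serial version, up to floating-point rounding when the entries are not integers.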
There are several important things to keep in mind when using @parallel to avoid unexpected behavior.
First: if you want to use any functions in your loops that are not in base Julia (e.g. either functions you define in your script or that you import from packages), then you must make those functions accessible to the workers. Thus, for example, the following would not work:
myprint(x) = println(x)
@parallel for idx = 1:10
    myprint(a[idx])
end
Instead, we would need to use:
@everywhere begin
    function myprint(x)
        println(x)
    end
end
@parallel for idx in 1:length(a)
    myprint(a[idx])
end
Second: although each worker can access the objects in the scope of the controller, it works on its own copy and cannot modify the controller's originals. Thus:
a = collect(1:10)
@parallel for idx = 1:length(a)
    a[idx] += 1
end
julia> a'
1x10 Array{Int64,2}:
1 2 3 4 5 6 7 8 9 10
Whereas, if we had executed the loop without @parallel, it would have successfully modified the array a.
To address this, we can instead convert a to a SharedArray object so that each worker can access and modify it:
a = convert(SharedArray{Float64,1}, collect(1:10))
@parallel for idx = 1:length(a)
    a[idx] += 1
end
julia> a'
1x10 Array{Float64,2}:
2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0
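The examples here use the syntax of older Julia releases. In Julia 0.7 and later, @parallel was renamed @distributed and lives in the Distributed standard library, while SharedArray lives in the SharedArrays standard library. The following is a minimal sketch of the same loop under those assumptions, with three local workers as above. It also wraps the loop in @sync, because a loop without a reduction operator returns before the workers have finished.

```julia
using Distributed, SharedArrays

addprocs(3)                       # three local workers, as in the examples above
@everywhere using SharedArrays    # make the type available on every worker

# Allocate a shared vector and fill it with 1.0, 2.0, ..., 10.0.
a = SharedArray{Float64}(10)
a .= 1:10

# @sync blocks until every worker has finished its share of the loop.
@sync @distributed for idx in 1:length(a)
    a[idx] += 1
end

println(a)   # each element has been incremented by one
```

Because the array is backed by shared memory, the updates made by the workers are visible on the controller once the loop has completed.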