Julia Language: Higher-Order Functions: Map, filter, and reduce


Example

Two of the most fundamental higher-order functions included in the standard library are map and filter. These functions are generic and can operate on any iterable. In particular, they are well-suited for computations on arrays.
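As a quick illustration before the larger example, here is a minimal sketch of map and filter applied to an array and to a range. The variable names are only illustrative.

```julia
# map applies a function to every element and collects the results.
squares = map(x -> x^2, [1, 2, 3, 4])

# filter keeps only the elements for which the predicate returns true.
# The input here is a range; the result is an ordinary Vector.
evens = filter(iseven, 1:10)

println(squares)  # [1, 4, 9, 16]
println(evens)    # [2, 4, 6, 8, 10]
```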

Suppose we have a dataset of schools. Each school teaches a particular subject, has a number of classes, and an average number of students per class. We can model a school with the following immutable type:

struct School  # `struct` replaces the pre-1.0 `immutable` keyword; structs are immutable by default
    subject::Symbol
    nclasses::Int
    nstudents::Int  # average no. of students per class
end

Our dataset of schools will be a Vector{School}:

dataset = [School(:math, 3, 30), School(:math, 5, 20), School(:science, 10, 5)]

Suppose we wish to find the number of students in total enrolled in a math program. To do this, we require several steps:

  • we must narrow the dataset down to only schools that teach math (filter)
  • we must compute the number of students at each school (map)
  • and we must reduce that list of numbers of students to a single value, the sum (reduce)

A naïve (not most performant) solution would simply be to use those three higher-order functions directly.

function nmath(data)
    maths = filter(x -> x.subject === :math, data)
    students = map(x -> x.nclasses * x.nstudents, maths)
    reduce(+, students; init=0)  # init is a keyword argument in Julia 1.0 and later
end

and we verify there are 190 math students in our dataset:

julia> nmath(dataset)
190

Julia also provides functions that fuse these steps and thus improve performance. For instance, we could have used the mapreduce function to perform the mapping and the reduction in one pass. This avoids allocating the intermediate array of per-school student counts, which saves time and memory.
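A minimal sketch of that idea follows, assuming Julia 1.0 or later (where init is a keyword argument). The School type and dataset are repeated so the snippet runs on its own; the function names nmath_mapreduce and nmath_lazy are hypothetical, chosen only for this illustration.

```julia
# Repeated from the example above so that this snippet is self-contained.
struct School
    subject::Symbol
    nclasses::Int
    nstudents::Int  # average no. of students per class
end

dataset = [School(:math, 3, 30), School(:math, 5, 20), School(:science, 10, 5)]

# Map and reduce in one pass: no intermediate array of student counts.
# The eager filter still allocates an array of the matching schools.
function nmath_mapreduce(data)
    maths = filter(x -> x.subject === :math, data)
    mapreduce(x -> x.nclasses * x.nstudents, +, maths; init=0)
end

# Variant with a lazy filter, so no intermediate array is allocated at all.
function nmath_lazy(data)
    maths = Iterators.filter(x -> x.subject === :math, data)
    mapreduce(x -> x.nclasses * x.nstudents, +, maths; init=0)
end

println(nmath_mapreduce(dataset))  # 190
println(nmath_lazy(dataset))       # 190
```

Passing init=0 also makes both functions return 0 rather than throw an error when no school matches the filter.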

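The reduce function is only meaningful for associative operations like +, because it does not guarantee the order in which elements are combined. Occasionally it is useful to perform a reduction with a non-associative operation. The higher-order functions foldl and foldr are provided to force a particular reduction order: foldl combines elements from the left, and foldr from the right.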
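Subtraction is a simple non-associative operation that makes the difference visible. The following sketch uses an arbitrary three-element vector:

```julia
xs = [1, 2, 3]

# foldl groups from the left: (1 - 2) - 3
left = foldl(-, xs)

# foldr groups from the right: 1 - (2 - 3)
right = foldr(-, xs)

println(left)   # -4
println(right)  # 2
```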