R Language Run-length encoding


A run is a consecutive sequence of repeated values or observations. For repeated values, R's "run-length encoding" concisely describes a vector in terms of its runs. Consider:

dat <- c(1, 2, 2, 2, 3, 1, 4, 4, 1, 1)

We have a length-one run of 1s; then a length-three run of 2s; then a length-one run of 3s; and so on. R's run-length encoding captures all the lengths and values of a vector's runs.


A run can also refer to consecutive observations in a tabular data. While R doesn't have a natural way of encoding these, they can be handled with rleid from the data.table package (currently a dead-end link).