Tutorial by Examples

Run-length encoding captures the lengths of runs of consecutive elements in a vector. Consider an example vector:

    dat <- c(1, 2, 2, 2, 3, 1, 4, 4, 1, 1)

The rle function extracts each run and its length:

    r <- rle(dat)
    r
    # Run Length Encoding
    #   lengths: int [1:6] 1 3 1 1 2 2
    #   valu...
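As a minimal sketch of working with the result, the object returned by rle is a list with two components, lengths and values, and base R's inverse.rle rebuilds the original vector from it. The variable name restored is only illustrative.

```r
# Example vector from the text
dat <- c(1, 2, 2, 2, 3, 1, 4, 4, 1, 1)

# rle() returns a list of class "rle" with components `lengths` and `values`
r <- rle(dat)
run_lengths <- r$lengths  # length of each run
run_values  <- r$values   # value repeated within each run

# inverse.rle() expands the encoding back into the full vector
restored <- inverse.rle(r)
identical(restored, dat)
```

Accessing the two components directly is usually more convenient than parsing the printed output, for example when looking for the longest run with max(r$lengths).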
One might want to group their data by the runs of a variable and perform some sort of analysis. Consider the following simple dataset:

    (dat <- data.frame(x = c(1, 1, 2, 2, 2, 1), y = 1:6))
    #   x y
    # 1 1 1
    # 2 1 2
    # 3 2 3
    # 4 2 4
    # 5 2 5
    # 6 1 6

The variable x has three runs: a run of l...
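One possible way to group by runs in base R, sketched here under the assumption that a per-run mean of y is the analysis of interest, is to build a run identifier from rle and pass it to tapply. The column name run and the variable run_means are hypothetical names chosen for this illustration.

```r
# Dataset from the text
dat <- data.frame(x = c(1, 1, 2, 2, 2, 1), y = 1:6)

# Encode the runs of x, then repeat each run's index by its length
# to get one run identifier per row
r <- rle(dat$x)
dat$run <- rep(seq_along(r$lengths), r$lengths)

# Summarise y within each run (here: the mean, as an example statistic)
run_means <- tapply(dat$y, dat$run, mean)
run_means
```

Grouping on the run identifier rather than on x itself keeps the first and last rows apart, even though both have x equal to 1.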
The data.table package provides a convenient way to group by runs in data. Consider the following example data:

    library(data.table)
    (DT <- data.table(x = c(1, 1, 2, 2, 2, 1), y = 1:6))
    #    x y
    # 1: 1 1
    # 2: 1 2
    # 3: 2 3
    # 4: 2 4
    # 5: 2 5
    # 6: 1 6

The variable x has three runs: a run ...
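A minimal sketch of run-wise grouping with data.table, assuming the package is installed and that the rleid function (which generates a run-length id) is the intended tool. The summary columns and the name run_summary are illustrative choices, not taken from the text.

```r
library(data.table)

# Example data from the text
DT <- data.table(x = c(1, 1, 2, 2, 2, 1), y = 1:6)

# rleid(x) assigns 1, 2, 3, ... to successive runs of x, so it can be
# used directly as a grouping expression in `by`
run_summary <- DT[, .(x = x[1], mean_y = mean(y)), by = .(run = rleid(x))]
run_summary
```

Because the run id is computed inside by, no helper column needs to be added to DT.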
Long vectors with long runs of the same value can be significantly compressed by storing them in their run-length encoding (the value of each run and the number of times that value is repeated). As an example, consider a vector of length 10 million with a huge number of 1's and only a small number o...
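The compression idea can be sketched as follows. This is a scaled-down illustration with one million elements rather than the 10 million mentioned above, and the positions of the zeros are hypothetical values chosen to make the example deterministic.

```r
# A long vector that is almost entirely 1's (scaled-down, illustrative size)
n <- 1e6
x <- rep(1, n)

# Hypothetical positions for a handful of 0's
x[c(250000, 500000, 750000)] <- 0

# Store the vector as its run-length encoding
r <- rle(x)
n_runs <- length(r$lengths)

# Compare memory footprints: the encoding keeps one value and one
# length per run instead of one element per position
size_full <- object.size(x)
size_rle  <- object.size(r)
size_rle < size_full

# The original vector can be recovered exactly when needed
restored <- inverse.rle(r)
identical(restored, x)
```

The saving depends on the number of runs, not the vector length, so a vector whose values alternate frequently would gain little or nothing from this representation.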
