data.table
offers a wide range of possibilities to reshape your data both efficiently and easily
For instance, while reshaping from long to wide you can both pass several variables into the value.var
and into the fun.aggregate
parameters at the same time
library(data.table) #v>=1.9.6
DT <- data.table(mtcars)
Long to wide
dcast(DT, gear ~ cyl, value.var = c("disp", "hp"), fun = list(mean, sum))
gear disp_mean_4 disp_mean_6 disp_mean_8 hp_mean_4 hp_mean_6 hp_mean_8 disp_sum_4 disp_sum_6 disp_sum_8 hp_sum_4 hp_sum_6 hp_sum_8
1: 3 120.100 241.5 357.6167 97 107.5 194.1667 120.1 483.0 4291.4 97 215 2330
2: 4 102.625 163.8 NaN 76 116.5 NaN 821.0 655.2 0.0 608 466 0
3: 5 107.700 145.0 326.0000 102 175.0 299.5000 215.4 145.0 652.0 204 175 599
This will set gear
as the index column, while mean
and sum
will be calculated for disp
and hp
for every gear
and cyl
combination. In case some combinations don't exist you could specify additional parameters such as na.rm = TRUE
(which will be passed to mean
and sum
functions) or specify the builtin fill
argument. You can also add margins, drop missing combinations and subset the data. See more in ?data.table::dcast
Wide to long
While reshaping from wide to long, you can pass columns to the measure.vars
parameter using regular expressions, for instance
print(melt(DT, c("cyl", "gear"), measure = patterns("^d", "e")), n = 10)
cyl gear variable value1 value2
1: 6 4 1 160.00 16.46
2: 6 4 1 160.00 17.02
3: 4 4 1 108.00 18.61
4: 6 3 1 258.00 19.44
5: 8 3 1 360.00 17.02
---
60: 4 5 2 3.77 5.00
61: 8 5 2 4.22 5.00
62: 6 5 2 3.62 5.00
63: 8 5 2 3.54 5.00
64: 4 4 2 4.11 4.00
This will melt
the data by cyl
and gear
as the index columns, while all the values for the variables that begin with d
(disp
& drat
) will be present in value1
and the values for the variables that contain the letter e
in them (qsec
and gear
) will be present in the value2
column.
You can also rename all the column names in the result while specifying variable.name
and value.name
arguments or decide if you want the character
columns to be automatically converted to factor
s or not while specifying variable.factor
and value.factor
arguments. See more in ?data.table::melt