Tutorial by Examples

A data.table is an enhanced version of the data.frame class from base R. As such, its class() attribute is the vector c("data.table", "data.frame"), and functions that work on a data.frame will also work with a data.table. There are many ways to create, load or coerce to a data.table....
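As a minimal sketch of the create, load and coerce routes mentioned above (the object names DT, DT2, DT3, df and df2 and their contents are made up for illustration, and fread(text = ...) assumes a reasonably recent data.table release):

```r
library(data.table)

# Create: build a data.table directly from vectors
DT <- data.table(id = 1:3, grp = c("a", "b", "a"))

# Coerce: as.data.table() returns a converted copy of a data.frame
df <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
DT2 <- as.data.table(df)

# Coerce in place: setDT() converts by reference, without copying
df2 <- data.frame(x = 1:3)
setDT(df2)

# Load: fread() reads delimited text; it also accepts a file path
DT3 <- fread(text = "a,b\n1,2\n3,4")

class(DT)            # "data.table" "data.frame"
is.data.frame(DT2)   # TRUE, since a data.table is also a data.frame
```

setDT() is usually the cheaper choice for large objects because it avoids a copy, while as.data.table() leaves the original data.frame untouched.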
The DT[where, select|update|do, by] syntax is used to work with columns of a data.table. The "where" part is the i argument. The "select|update|do" part is the j argument. These two arguments are usually passed by position instead of by name. Our example data below is mtcars =...
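A short sketch of the three positions, using the built-in mtcars data. The name DT and the derived column hp_per_wt are chosen here for illustration and are not taken from the original tutorial:

```r
library(data.table)

# Work on a converted copy so the built-in mtcars stays untouched
DT <- as.data.table(mtcars)

# i ("where"): keep only rows with 6 cylinders
six <- DT[cyl == 6]

# j ("select"): compute a summary over all rows
avg <- DT[, .(mean_mpg = mean(mpg))]

# j with by: the same summary plus a row count, once per cyl group
by_cyl <- DT[, .(mean_mpg = mean(mpg), n = .N), by = cyl]

# j ("update"): := adds a column by reference, no reassignment needed
DT[, hp_per_wt := hp / wt]
```

Because i and j are matched by position, a leading comma (as in DT[, ...]) is what signals "all rows, then do j".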
.SD
.SD refers to the subset of the data.table for each group, excluding all columns used in by. .SD along with lapply can be used to apply any function to multiple columns by group in a data.table. We will continue using the same built-in dataset, mtcars: mtcars = data.table(mtcars) # Let's not ...
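A minimal sketch of the .SD plus lapply pattern described above. The name DT and the choice of columns in .SDcols are illustrative assumptions:

```r
library(data.table)

DT <- as.data.table(mtcars)

# Mean of every non-grouping column, computed separately for each cyl value.
# Inside j, .SD is the current group's rows without the by column.
means_all <- DT[, lapply(.SD, mean), by = cyl]

# .SDcols restricts .SD to the listed columns
means_some <- DT[, lapply(.SD, mean), by = cyl, .SDcols = c("mpg", "hp")]
```

Restricting with .SDcols is generally worth doing when only a few columns are needed, since .SD is otherwise built from every non-grouping column.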
Differences in subsetting syntax
A data.table is one of several two-dimensional data structures available in R, besides data.frame, matrix and (2D) array. All of these classes use a very similar but not identical syntax for subsetting, the A[rows, cols] schema. Consider the following data stored i...
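To illustrate a few of these differences, here is a small sketch on made-up data (df, DT and m are hypothetical and are not the tutorial's own example). The DT[, 1] behavior shown assumes data.table 1.9.8 or later:

```r
library(data.table)

df <- data.frame(a = 1:3, b = c("x", "y", "z"))
DT <- as.data.table(df)
m  <- as.matrix(df)   # mixed types, so this becomes a character matrix

# A single index selects a column in a data.frame but a row in a data.table
col_df <- df[2]       # data.frame holding column b
row_dt <- DT[2]       # data.table holding the second row

# A single index on a matrix picks one element in column-major order
elem_m <- m[2]        # second element of the first column

# Selecting one column by position: data.frame drops to a vector,
# data.table keeps a one-column data.table
vec_df <- df[, 1]
one_dt <- DT[, 1]

# An unquoted column name in j returns the column as a vector
vec_dt <- DT[, a]
```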
Yes, you need to SETKEY pre 1.9.6
In the past (pre 1.9.6), you sped up operations on a data.table by setting columns as keys to the table, particularly for large tables. [See intro vignette page 5 of September 2015 version, where speed of search was 544 times better.] You may find older code making use of t...
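For comparison, a hedged sketch of the keyed style found in older code next to the on= argument available from 1.9.6 onward. The table DT and its columns id and val are invented for illustration:

```r
library(data.table)

DT <- data.table(id = c("b", "a", "c", "a"), val = 1:4)

# From 1.9.6: ad hoc lookup with on=, no key required and DT is not reordered
r1 <- DT[.("a"), on = "id"]

# Older style: setkey() sorts DT by id by reference and marks id as the key,
# after which a lookup in i uses the key implicitly
setkey(DT, id)
r2 <- DT[.("a")]

key(DT)   # "id"
```

Note that setkey() physically reorders the table, which can surprise code that relies on the original row order; on= avoids that side effect.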
