R Language data.table Creating a data.table


A data.table is an enhanced version of the data.frame class from base R. As such, its class() attribute is the vector c("data.table", "data.frame"), and most functions that work on a data.frame will also work on a data.table. There are many ways to create, load, or coerce to a data.table.


Don't forget to install and load the data.table package before running the examples.
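A minimal setup sketch, assuming the package is installed from CRAN. The conditional install is just one convenient pattern, not a requirement:

```r
# Install data.table once if it is not already available (assumes CRAN access)
if (!requireNamespace("data.table", quietly = TRUE)) {
  install.packages("data.table")
}

# Attach the package so data.table(), fread(), setDT(), etc. are available
library(data.table)
```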


There is a constructor of the same name:

DT <- data.table(
  x = letters[1:5], 
  y = 1:5, 
  z = (1:5) > 3
)
DT
#    x y     z
# 1: a 1 FALSE
# 2: b 2 FALSE
# 3: c 3 FALSE
# 4: d 4  TRUE
# 5: e 5  TRUE

Unlike data.frame in R versions before 4.0.0 (where stringsAsFactors defaulted to TRUE), data.table does not coerce strings to factors by default:

sapply(DT, class)
#               x           y           z 
#     "character"   "integer"   "logical" 
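If factor columns are actually wanted, the constructor accepts a stringsAsFactors argument. A short sketch (the names DT_chr and DT_fct are only illustrative):

```r
library(data.table)

# Default: character columns stay character
DT_chr <- data.table(x = letters[1:5], y = 1:5)

# Opt in to factor conversion with the stringsAsFactors argument
DT_fct <- data.table(x = letters[1:5], y = 1:5, stringsAsFactors = TRUE)

class(DT_chr$x)  # "character"
class(DT_fct$x)  # "factor"
```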

Read in

We can read from a text file:

dt <- fread("my_file.csv")

Unlike read.csv in R versions before 4.0.0 (where stringsAsFactors defaulted to TRUE), fread reads strings as character columns, not factors, by default.
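Since "my_file.csv" may not exist on your machine, here is a self-contained sketch that writes a small temporary file with fwrite and reads it back. The file contents and the names tmp, dt and dt_fct are made up for illustration:

```r
library(data.table)

# Write a small CSV to a temporary file (illustrative data only)
tmp <- tempfile(fileext = ".csv")
fwrite(data.table(id = 1:3, name = c("a", "b", "c")), tmp)

# Read it back; fread detects the separator and column types
dt <- fread(tmp)
sapply(dt, class)   # id is "integer", name is "character"

# Opt in to factors when they are needed
dt_fct <- fread(tmp, stringsAsFactors = TRUE)
class(dt_fct$name)  # "factor"

# Clean up the temporary file
unlink(tmp)
```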

Modify a data.frame

For efficiency, data.table offers a way of converting a data.frame or list into a data.table in place, by reference, without copying the underlying column data:

# example data.frame
DF <- data.frame(x = letters[1:5], y = 1:5, z = (1:5) > 3)
# modification: convert DF to a data.table in place
setDT(DF)

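Note that we do not assign the result with <-, since the object DF itself has been modified in place. The column classes of the original data.frame are retained (the factor column below reflects R versions before 4.0.0, where data.frame converted strings to factors by default):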

sapply(DF, class)
#         x         y         z 
#  "factor" "integer" "logical" 
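One way to see the in-place conversion, sketched with data.table's address() helper. The variable col_before is only illustrative, and the final comparison is left for you to run rather than asserted here:

```r
library(data.table)

DF <- data.frame(x = letters[1:5], y = 1:5)
class(DF)          # "data.frame"

# address() reports where an object lives in memory
col_before <- address(DF$y)

setDT(DF)          # converts in place; no assignment needed
class(DF)          # "data.table" "data.frame"

# Compare column locations to check whether the column data was copied
identical(col_before, address(DF$y))
```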

Coerce object to data.table

If you have a list, data.frame, or data.table, use the setDT function to convert it to a data.table: it does the conversion by reference, whereas as.data.table makes a copy. This matters when you are working with large datasets.
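A small sketch of the difference, using a made-up list (the names lst and DT_copy are only illustrative):

```r
library(data.table)

# A plain list of equal-length vectors
lst <- list(id = 1:3, value = c(0.5, 1.5, 2.5))

# as.data.table returns a new object; lst itself is left as a list
DT_copy <- as.data.table(lst)
is.data.table(lst)      # FALSE
is.data.table(DT_copy)  # TRUE

# setDT converts lst itself, by reference
setDT(lst)
is.data.table(lst)      # TRUE
```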

If you have another R object (such as a matrix), setDT does not apply; use as.data.table (or the data.table constructor) to coerce it to a data.table.

mat <- matrix(0, ncol = 10, nrow = 10)

DT <- as.data.table(mat)
# or
DT <- data.table(mat)
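As a further sketch of matrix coercion, the example below uses a small made-up matrix with row names to show the default column names and the keep.rownames argument of as.data.table:

```r
library(data.table)

# A 2 x 3 matrix with row names but no column names (illustrative data)
mat <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), NULL))

# Unnamed matrix columns get default names V1, V2, ...
DT <- as.data.table(mat)
names(DT)     # "V1" "V2" "V3"

# keep.rownames = TRUE stores the row names in a column called "rn"
DT_rn <- as.data.table(mat, keep.rownames = TRUE)
names(DT_rn)  # "rn" "V1" "V2" "V3"
```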