A data.table is an enhanced version of the data.frame class from base R. As such, its class() attribute is the vector c("data.table", "data.frame"), and most functions that accept a data.frame will also work with a data.table. There are many ways to create, load, or coerce to a data.table.
Don't forget to install and load the data.table package:
library(data.table)
There is a constructor of the same name:
DT <- data.table(
  x = letters[1:5],
  y = 1:5,
  z = (1:5) > 3
)
# x y z
# 1: a 1 FALSE
# 2: b 2 FALSE
# 3: c 3 FALSE
# 4: d 4 TRUE
# 5: e 5 TRUE
Unlike data.frame in versions of R before 4.0.0 (where stringsAsFactors defaulted to TRUE), data.table will not coerce strings to factors:
sapply(DT, class)
# x y z
# "character" "integer" "logical"
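The class relationship described at the top can be checked directly. The following is a minimal sketch; the table contents are made up for illustration.

```r
library(data.table)

# Small illustrative table (values are arbitrary)
DT <- data.table(x = letters[1:5], y = 1:5)

class(DT)                   # "data.table" "data.frame"
is.data.frame(DT)           # TRUE, because data.table inherits from data.frame
inherits(DT, "data.table")  # TRUE

# Base functions written for data.frame still apply
nrow(DT)                    # 5
```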
We can read from a text file:
dt <- fread("my_file.csv")
Unlike read.csv in versions of R before 4.0.0, fread will read strings as strings, not as factors.
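Since "my_file.csv" above is only a placeholder, here is a self-contained sketch that writes a small temporary CSV with fwrite and reads it back with fread. The file contents and column names are assumptions chosen for illustration.

```r
library(data.table)

# Write a small CSV to a temporary file so the example can run anywhere
tmp <- tempfile(fileext = ".csv")
fwrite(data.table(x = letters[1:3], y = 1:3), tmp)

# Read it back; the result is a data.table
dt <- fread(tmp)
sapply(dt, class)
#           x           y
# "character"   "integer"

unlink(tmp)  # clean up the temporary file
```

If factors are wanted, fread accepts a stringsAsFactors argument, which is FALSE by default.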
For efficiency, data.table offers a way of converting a data.frame or list to a data.table in place, by reference, without copying its columns:
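# example data.frame (stringsAsFactors = TRUE reproduces the pre-4.0.0 default,
# so that x is a factor as in the output below)
DF <- data.frame(x = letters[1:5], y = 1:5, z = (1:5) > 3, stringsAsFactors = TRUE)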
# modification
setDT(DF)
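Note that we do not assign the result with <-, since the object DF has been modified in place. The column classes of the data.frame are retained, so a column that was a factor in DF (as x is when strings are converted to factors) remains a factor: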
sapply(DF, class)
# x y z
# "factor" "integer" "logical"
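One way to see that the conversion does not copy column data is to compare a column's memory address before and after the call. This is a sketch using data.table's address() helper; the small DF here is illustrative.

```r
library(data.table)

DF <- data.frame(x = letters[1:5], y = 1:5)

# Record where the y column lives before the conversion
col_before <- address(DF$y)

setDT(DF)                 # converts DF itself; no assignment needed

is.data.table(DF)         # TRUE

# Compare the column address after the conversion with the one recorded earlier
identical(col_before, address(DF$y))
```

The comparison is made on a column rather than on DF itself because the claim being checked is that the column data is not duplicated.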
If you have a list, data.frame, or data.table, use the setDT function to convert it to a data.table, because it does the conversion by reference instead of making a copy (which as.data.table does). This matters when you are working with large datasets.
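The difference between the two approaches can be sketched as follows. The list L and the data.frame DF are hypothetical examples: setDT changes the object it is given, while as.data.table returns a new object and leaves its input alone.

```r
library(data.table)

# setDT on a list: L itself becomes a data.table
L <- list(id = 1:3, label = c("a", "b", "c"))
setDT(L)
is.data.table(L)      # TRUE

# as.data.table on a data.frame: the result is a separate object
DF <- data.frame(id = 1:3)
DT2 <- as.data.table(DF)
is.data.table(DT2)    # TRUE
is.data.table(DF)     # FALSE, DF is still a plain data.frame
```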
If you have another R object (such as a matrix), use as.data.table (or the data.table constructor) to coerce it to a data.table, since setDT only accepts a list, data.frame, or data.table:
mat <- matrix(0, ncol = 10, nrow = 10)
DT <- as.data.table(mat)
# or
DT <- data.table(mat)
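As a small follow-up sketch, a matrix without column names gets default names V1, V2, and so on when coerced. The matrix below mirrors the one above.

```r
library(data.table)

mat <- matrix(0, ncol = 10, nrow = 10)
DT <- as.data.table(mat)

dim(DT)      # 10 10
names(DT)    # "V1" "V2" ... "V10"
```

If the matrix has column names, those are used in place of the defaults.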