R Language Tutorial => data.table

Introduction

data.table is a package that extends base R's data frames, improving in particular on their performance and syntax. See the package's Docs area at Getting started with data.table for details.

Syntax

DT[i, j, by]
# DT[where, select|update|do, by]
DT[...][...]
# chaining
################# Shortcuts, special functions and special symbols inside DT[...]
.()
# in several arguments, replaces list()
J()
# in i, replaces list()
:=
# in j, a function used to add, modify or remove columns by reference
.N
# in i, the total number of rows
# in j, the number of rows in a group
.I
# in j, the vector of row numbers in the table (filtered by i)
.SD
# in j, the current subset of the data
# selected by the .SDcols argument
.GRP
# in j, the current index of the subset of the data
.BY
# in j, the list of by values for the current subset of data
V1, V2, ...
# default names for unnamed columns created in j
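The shortcuts above can be sketched on a small table. This is a minimal illustration, not a reference: the table DT and its columns id and value are hypothetical names chosen for the example.

```r
library(data.table)

# Hypothetical example table; column names are illustrative only
DT <- data.table(
  id    = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# i filters rows, j computes, by groups
sums <- DT[value > 1, .(total = sum(value)), by = id]

# .N in j: number of rows in each group
counts <- DT[, .N, by = id]

# := in j: add a column by reference (DT itself is modified)
DT[, double_value := value * 2]

# .SD with .SDcols: apply a function to the selected columns within each group
means <- DT[, lapply(.SD, mean), by = id, .SDcols = c("value", "double_value")]

# .N in i: select the last row of the table
last_row <- DT[.N]

# Chaining: aggregate first, then filter the aggregated result
big_groups <- DT[, .(total = sum(value)), by = id][total > 5]
```

Because := works by reference, no assignment such as DT <- ... is needed for the new column to persist.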
################# Joins inside DT[...]
DT1[DT2, on, j]
# join two tables
i.*
# special prefix on DT2's columns after the join
by=.EACHI
# special option available only with a join
DT1[!DT2, on, j]
# anti-join two tables
DT1[DT2, on, roll, j]
# join two tables, rolling on the last column in on=
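As a rough sketch of the join forms listed above, the following uses small hypothetical tables (orders, info, prices and trades are invented names, as are their columns). It assumes a data.table version that supports on= (1.9.6 or later).

```r
library(data.table)

# Hypothetical tables for illustration
orders <- data.table(customer = c("x", "y", "y", "z"), amount = c(10, 20, 30, 40))
info   <- data.table(customer = c("x", "y", "w"), region = c("north", "south", "east"))

# DT1[DT2, on]: for each row of info, look up the matching rows of orders;
# "w" has no match in orders, so its amount is NA
joined <- orders[info, on = "customer"]

# i.* prefix: refer explicitly to a column of info inside j
labelled <- orders[info, on = "customer", .(amount, region = i.region)]

# by = .EACHI: evaluate j once for each row of info
totals <- orders[info, on = "customer", .(total = sum(amount)), by = .EACHI]

# Anti-join: rows of orders that have no match in info
unmatched <- orders[!info, on = "customer"]

# Rolling join: carry the last known price forward to each trade time
prices <- data.table(time = c(1, 5, 10), price = c(100, 101, 102))
trades <- data.table(time = c(3, 7))
rolled <- prices[trades, on = "time", roll = TRUE]
```

Note that the table inside the brackets (DT2) drives the result: the output has one row per match for each of its rows, which is the reverse of how some readers expect a left join to be written.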
################# Reshaping, stacking and splitting
melt(DT, id.vars, measure.vars)
# transform to long format
# for multiple columns, use measure.vars = patterns(...)
dcast(DT, formula)
# transform to wide format
rbind(DT1, DT2, ...)
# stack enumerated data.tables
rbindlist(DT_list, idcol)
# stack a list of data.tables
split(DT, by)
# split a data.table into a list
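The reshaping and stacking functions above might be used as follows. This is a small sketch with a hypothetical table wide and invented column names; split() with a by argument assumes a reasonably recent data.table release.

```r
library(data.table)

# Hypothetical wide table: one row per id, one column per measurement
wide <- data.table(id = c(1, 2), x = c(10, 20), y = c(30, 40))

# Wide to long: one row per (id, variable) pair
long <- melt(wide, id.vars = "id", measure.vars = c("x", "y"))

# Long back to wide: one column per level of variable
wide_again <- dcast(long, id ~ variable, value.var = "value")

# Stack a named list of tables; idcol records which list element each row came from
stacked <- rbindlist(list(first = wide, second = wide), idcol = "source")

# Split back into a list of tables, one per value of source
pieces <- split(stacked, by = "source")
```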
################# Some other functions specialized for data.tables
foverlaps
# overlap joins
merge
# another way of joining two tables
set
# another way of adding or modifying columns
fintersect, fsetdiff, funion, fsetequal, unique, duplicated, anyDuplicated
# set-theory operations with rows as elements
uniqueN
# the number of distinct rows
rowidv(DT, cols)
# row ID (1 to .N) within each group determined by cols
rleidv(DT, cols)
# run ID (1 to .GRP), where groups are runs of consecutive equal values in cols
shift(DT, n, type=c("lag", "lead"))
# apply a shift operator to every column
setorder, setcolorder, setnames, setkey, setindex, setattr
# modify attributes and order by reference
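A few of the helper functions above, sketched on a hypothetical table (DT, g and v are illustrative names). The example is meant to show the difference between rowidv, which counts occurrences within a group, and rleidv, which numbers consecutive runs.

```r
library(data.table)

# Hypothetical table: note that "a" appears in two separate runs
DT <- data.table(g = c("a", "a", "b", "a"), v = c(1, 2, 3, 4))

n_groups     <- uniqueN(DT, by = "g")              # number of distinct values of g
row_in_group <- rowidv(DT, cols = "g")             # position of each row within its group
run_id       <- rleidv(DT, cols = "g")             # increments whenever g changes
lagged       <- shift(DT$v, n = 1, type = "lag")   # previous value, NA for the first row

# Modify the table by reference: rename a column, then add one with set()
setnames(DT, "v", "value")
set(DT, j = "flag", value = DT$value > 2)
```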

Remarks

Installation and support

To install the data.table package:

# install from CRAN
install.packages("data.table")       

# or install development version 
install.packages("data.table", type = "source", repos = "http://Rdatatable.github.io/data.table")

# and to revert from devel to CRAN, the current version must first be removed
remove.packages("data.table")
install.packages("data.table")

The package's official site has wiki pages that help with getting started, along with lists of presentations and articles from around the web. Before asking a question, here on Stack Overflow or anywhere else, please read the support page.

Loading the package

Many of the functions listed above live in the data.table namespace. To use them, either attach the package first with library(data.table) or call them by their qualified name, such as data.table::fread instead of simply fread. For help on an individual function, use help("fread") or ?fread. Again, if the package is not attached, use the qualified name, as in ?data.table::fread.
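The two ways of reaching the package's functions can be sketched as below. The inline CSV text and the object names are made up for illustration, and the text argument of fread assumes a reasonably recent data.table release.

```r
# Without attaching the package, qualify each call with the namespace
tbl <- data.table::fread(text = "a,b\n1,2\n3,4")
data.table::setnames(tbl, "b", "c")

# After attaching the package, the short names are available
library(data.table)
tbl2 <- fread(text = "a,b\n1,2\n3,4")
```

The qualified form is handy in scripts that use only one or two data.table functions, while library(data.table) is usually more convenient for interactive work.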
