data.table is an R package that extends base R data frames, improving in particular on their performance and syntax. See "Getting started with data.table" in the package's Docs area for details.

- `DT[i, j, by]`: `DT[where, select|update|do, by]`
- `DT[...][...]`: chaining

`################# Shortcuts, special functions and special symbols inside DT[...]`

- `.()`: in several arguments, replaces `list()`
- `J()`: in `i`, replaces `list()`
- `:=`: in `j`, a function used to add or modify columns
- `.N`: in `i`, the total number of rows; in `j`, the number of rows in a group
- `.I`: in `j`, the vector of row numbers in the table (filtered by `i`)
- `.SD`: in `j`, the current subset of the data, selected by the `.SDcols` argument
- `.GRP`: in `j`, the current index of the subset of the data
- `.BY`: in `j`, the list of `by` values for the current subset of data
- `V1, V2, ...`: default names for unnamed columns created in `j`

`################# Joins inside DT[...]`

- `DT1[DT2, on, j]`: join two tables
- `i.*`: special prefix on DT2's columns after the join
- `by=.EACHI`: special option available only with a join
- `DT1[!DT2, on, j]`: anti-join two tables
- `DT1[DT2, on, roll, j]`: join two tables, rolling on the last column in `on=`

`################# Reshaping, stacking and splitting`

- `melt(DT, id.vars, measure.vars)`: transform to long format; for multiple columns, use `measure.vars = patterns(...)`
- `dcast(DT, formula)`: transform to wide format
- `rbind(DT1, DT2, ...)`: stack enumerated data.tables
- `rbindlist(DT_list, idcol)`: stack a list of data.tables
- `split(DT, by)`: split a data.table into a list

`################# Some other functions specialized for data.tables`

- `foverlaps`: overlap joins
- `merge`: another way of joining two tables
- `set`: another way of adding or modifying columns
- `fintersect`, `fsetdiff`, `funion`, `fsetequal`, `unique`, `duplicated`, `anyDuplicated`: set-theory operations with rows as elements
- `uniqueN`: the number of distinct rows
- `rowidv(DT, cols)`: row ID (1 to `.N`) within each group determined by `cols`
- `rleidv(DT, cols)`: group ID (1 to `.GRP`) within each group determined by runs of `cols`
- `shift(DT, n, type=c("lag", "lead"))`: apply a shift operator to every column
- `setorder`, `setcolorder`, `setnames`, `setkey`, `setindex`, `setattr`: modify attributes and order by reference
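
As a rough illustration of the `DT[i, j, by]` form, chaining, and a few of the special symbols listed above, here is a minimal sketch. The table `DT` and its columns `g`, `x` and `y` are made up for this example and are not taken from the package documentation.

```r
library(data.table)

# Hypothetical toy data; the column names are invented for illustration.
DT <- data.table(
  g = c("a", "a", "b", "b", "b"),
  x = 1:5,
  y = c(10, 20, 30, 40, 50)
)

# DT[i, j, by]: filter rows in i, compute in j, group with by.
# .() is shorthand for list(); .N is the number of rows in each group.
res <- DT[x > 1, .(n = .N, total = sum(y)), by = g]

# Chaining: pass the result of one DT[...] straight into another.
top <- DT[, .(total = sum(y)), by = g][order(-total)][1]

# := adds or modifies a column by reference.
DT[, y_share := y / sum(y), by = g]

# .SD is the current group's subset, restricted here by .SDcols.
means <- DT[, lapply(.SD, mean), by = g, .SDcols = c("x", "y")]

# .GRP is the index of the current group.
DT[, grp_id := .GRP, by = g]
```

Because `:=` works by reference, `DT` itself gains the `y_share` and `grp_id` columns, while `res`, `top` and `means` are new tables.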

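The join and reshaping entries above can be sketched in the same way. The tables `sales`, `lookup` and `wide` are hypothetical and exist only for this example; rolling joins are left out to keep it short.

```r
library(data.table)

# Hypothetical toy tables; all names are invented for illustration.
sales  <- data.table(id = c(1L, 2L, 2L, 3L), amount = c(5, 7, 1, 4))
lookup <- data.table(id = c(2L, 3L, 4L), label = c("two", "three", "four"))

# DT1[DT2, on]: for each row of lookup, find the matching rows of sales.
# Rows of lookup without a match are kept, with NA in sales' columns.
joined <- sales[lookup, on = "id"]

# by = .EACHI evaluates j once per row of lookup;
# the i. prefix refers to lookup's columns.
per_id <- sales[lookup, on = "id",
                .(label = i.label, total = sum(amount)), by = .EACHI]

# DT1[!DT2, on]: anti-join, the rows of sales with no match in lookup.
unmatched <- sales[!lookup, on = "id"]

# melt() to long format and dcast() back to wide format.
wide <- data.table(id = 1:2, a = c(1, 2), b = c(3, 4))
long <- melt(wide, id.vars = "id", measure.vars = c("a", "b"))
back <- dcast(long, id ~ variable, value.var = "value")

# rbindlist() stacks a list of tables; idcol records each row's source.
stacked <- rbindlist(list(first = sales, second = sales), idcol = "src")

# split() with by= returns a list with one data.table per group.
pieces <- split(sales, by = "id")
```
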
To install the data.table package:

```
# install from CRAN
install.packages("data.table")
# or install development version
install.packages("data.table", type = "source", repos = "http://Rdatatable.github.io/data.table")
# and to revert from devel to CRAN, the current version must first be removed
remove.packages("data.table")
install.packages("data.table")
```

The package's official site has wiki pages providing help getting started, and lists of presentations and articles from around the web. Before asking a question -- here on Stack Overflow or anywhere else -- please read the support page.

Many of the functions in the examples above exist in the data.table namespace. To use them, you will need to add a line like `library(data.table)` first or to use their full path, like `data.table::fread` instead of simply `fread`. For help on individual functions, the syntax is `help("fread")` or `?fread`. Again, if the package is not loaded, use the full name like `?data.table::fread`.
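
To make the difference between the full path and the attached namespace concrete, here is a small sketch. It assumes a data.table version recent enough for `fread` to accept literal input through its `text` argument; the CSV content is invented for the example.

```r
# Without attaching the package, refer to functions by their full path.
DT <- data.table::fread(text = "id,value\n1,10\n2,20")

# After attaching the package, the short name works as well.
library(data.table)
DT2 <- fread(text = "id,value\n1,10\n2,20")
```

Both calls should give the same two-row table; the full path is handy in scripts and packages that avoid attaching data.table.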