```
library(data.table)

# example data
DT = data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4,6,8))]
```

Suppose we want the `summary` function output for `Sepal.Length` along with the number of observations:

```
DT[, c(
  as.list(summary(Sepal.Length)),
  N = .N
), by=.(Species, Bin)]
#       Species   Bin Min. 1st Qu. Median  Mean 3rd Qu. Max.  N
# 1:     setosa (4,6]  4.3     4.8    5.0 5.006     5.2  5.8 50
# 2: versicolor (6,8]  6.1     6.2    6.4 6.450     6.7  7.0 20
# 3: versicolor (4,6]  4.9     5.5    5.6 5.593     5.8  6.0 30
# 4:  virginica (6,8]  6.1     6.4    6.7 6.778     7.2  7.9 41
# 5:  virginica (4,6]  4.9     5.7    5.8 5.722     5.9  6.0  9
```

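We have to make `j` a list of columns. Here `summary` returns a named vector, so `as.list` turns each statistic into its own column and `c` appends `N`. Usually, some playing around with `c`, `as.list` and `.` is enough to figure out the correct way to proceed.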
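As a minimal sketch of why the list form matters, the snippet below rebuilds the example data and compares two ways of constructing `j`. The result names `res1` and `res2` and the column name `Mean` are illustrative and not part of the original example.

```r
library(data.table)

DT <- data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4, 6, 8))]

# summary() returns a named numeric vector, not a list
s <- summary(DT$Sepal.Length)

# as.list() makes each named statistic a list element (one column each);
# c() then appends a further element holding the group size
res1 <- DT[, c(as.list(summary(Sepal.Length)), N = .N), by = .(Species, Bin)]

# .() is an alias for list(), convenient when naming each statistic yourself
res2 <- DT[, .(Mean = mean(Sepal.Length), N = .N), by = .(Species, Bin)]
```

Both calls return one row per `Species`/`Bin` group; the first takes its column names from `summary`, the second from the names given inside `.()`.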

Instead of making a summary table, we may want to store a summary statistic in a new column. We can use `:=` as usual. For example,

```
DT[, is_big := .N >= 25, by=.(Species, Bin)]
```
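The following sketch shows what the grouped `:=` does to the table. It assumes the example data from above; the column names `grp_n` and `grp_mean` are hypothetical additions for illustration.

```r
library(data.table)

DT <- data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4, 6, 8))]

# := adds the column by reference; the single value computed per group
# is recycled to every row of that group, so DT keeps all of its rows
DT[, is_big := .N >= 25, by = .(Species, Bin)]

# several columns at once with the functional form of :=
DT[, `:=`(grp_n = .N, grp_mean = mean(Sepal.Length)), by = .(Species, Bin)]

# one row per group, to inspect the stored statistics
groups <- unique(DT[, .(Species, Bin, grp_n, is_big)])
```

Unlike the summary table, the result still has one row per observation, with the group-level value repeated within each group.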

If you find yourself wanting to parse column names, like

> Take the mean of `x.Length/x.Width` where `x` takes ten different values.

then you are probably looking at data embedded in column names, which is a bad idea. Read about tidy data and then reshape to long format.
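One possible way to do such a reshape is sketched below, using the `iris` columns, where the flower part (`Sepal` or `Petal`) is embedded in the column names. The helper columns `id`, `name`, `part` and `dimension` are names chosen for this illustration only.

```r
library(data.table)

DT <- data.table(iris)
DT[, id := .I]  # row identifier so each flower stays distinguishable

# long format: one row per (flower, measurement column)
long <- melt(DT, id.vars = c("id", "Species"),
             variable.name = "name", value.name = "value")

# split "Sepal.Length" into part = "Sepal" and dimension = "Length"
long[, c("part", "dimension") := tstrsplit(name, ".", fixed = TRUE)]

# one row per (flower, part), with Length and Width as ordinary columns
tidy <- dcast(long, id + Species + part ~ dimension, value.var = "value")

# the "mean of x.Length/x.Width" is now a plain grouped computation
ratios <- tidy[, .(mean_ratio = mean(Length / Width)), by = part]
```

Once `part` is a column, adding more parts needs no change to the code that computes the ratio.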

Data frames and data.tables are well designed for tabular data, where rows correspond to observations and columns to variables. If you find yourself wanting to summarize over rows, like

> Find the standard deviation across columns for each row.

then you should probably be using a matrix or some other data format entirely.
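As a rough sketch of the matrix route, the numeric columns of the example data can be converted with `as.matrix` and summarized row by row with `apply`. The names `m` and `row_sd` are illustrative.

```r
library(data.table)

DT <- data.table(iris)

# keep only the numeric measurement columns and convert them to a matrix
m <- as.matrix(DT[, .(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)])

# apply() over margin 1 computes one value per row
row_sd <- apply(m, 1, sd)
```

This only makes sense when the columns are comparable quantities, which is itself a hint that the data is matrix-like and not tabular.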