# data.table: Computing summary statistics - Counting rows by group

## Example

```r
# example data: iris plus a Bin column that splits Sepal.Length into (4,6] and (6,8]
library(data.table)
DT = data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4,6,8))]
```

## Using `.N`

In `j`, `.N` holds the number of rows in the current subset: the rows selected by `i`, or each group when `by` is used. When exploring data, `.N` is handy to...

1. count rows in a group,

```r
DT[Species == "setosa", .N]

# [1] 50
```

2. or count rows in all groups,

```r
DT[, .N, by=.(Species, Bin)]

#       Species   Bin  N
# 1:     setosa (4,6] 50
# 2: versicolor (6,8] 20
# 3: versicolor (4,6] 30
# 4:  virginica (6,8] 41
# 5:  virginica (4,6]  9
```

3. or find groups that have a certain number of rows.

```r
DT[, .N, by=.(Species, Bin)][ N < 25 ]

#       Species   Bin  N
# 1: versicolor (6,8] 20
# 2:  virginica (4,6]  9
```
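
As a rough sketch of how these grouped counts can be tidied and sanity-checked, the snippet below rebuilds the example data and uses `keyby` and `order`, which the examples above do not use. The variable name `counts` is only illustrative.

```r
library(data.table)

# same example data as above
DT <- data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4, 6, 8))]

# keyby= groups like by=, but also sorts the result by the grouping columns
counts <- DT[, .N, keyby = .(Species, Bin)]
print(counts)

# sanity check: the group counts add up to the total number of rows
stopifnot(counts[, sum(N)] == nrow(DT))

# list the groups from largest to smallest count
print(counts[order(-N)])
```

Storing the counts in a variable first makes it easy to filter, sort, or check them without recomputing the grouping.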

## Handling missing groups

However, the results above omit groups with a count of zero, because `by` only returns groups that actually occur in the data. If the empty groups matter, we can use `table` from base R:

```r
DT[, data.table(table(Species, Bin))][ N < 25 ]

#       Species   Bin  N
# 1:  virginica (4,6]  9
# 2:     setosa (6,8]  0
# 3: versicolor (6,8] 20
```
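
The zero-count group shows up here because `Species` and `Bin` are both factors, and `table` tabulates every combination of factor levels whether or not it occurs. The following is a minimal sketch of that difference on the same example data; `tab` and `observed` are illustrative names.

```r
library(data.table)

DT <- data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4, 6, 8))]

# table() has one cell per combination of factor levels, including empty ones
tab <- table(DT$Species, DT$Bin)
print(dim(tab))        # 3 species x 2 bins
print(sum(tab == 0))   # number of empty Species/Bin combinations

# by= only returns the combinations that are present in the data
observed <- DT[, .N, by = .(Species, Bin)]
print(nrow(observed))
```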

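Alternatively, we can join on all groups. `CJ` builds every combination of `Species` and `Bin`, and `by=.EACHI` counts the matching rows for each combination, giving zero where there is no match: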

```r
DT[CJ(Species=Species, Bin=Bin, unique=TRUE), on=c("Species","Bin"), .N, by=.EACHI][N < 25]

#       Species   Bin  N
# 1:     setosa (6,8]  0
# 2: versicolor (6,8] 20
# 3:  virginica (4,6]  9
```
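
To make the join easier to follow, it can be split into two steps. This is only a sketch of the same idea, with the intermediate table stored under the illustrative name `groups`:

```r
library(data.table)

DT <- data.table(iris)
DT[, Bin := cut(Sepal.Length, c(4, 6, 8))]

# step 1: CJ() cross-joins its arguments; unique=TRUE de-duplicates each one first
groups <- DT[, CJ(Species = Species, Bin = Bin, unique = TRUE)]
print(groups)   # 3 species x 2 bins = 6 rows

# step 2: by=.EACHI evaluates j once per row of `groups`;
# .N is 0 for a combination that matches no rows in DT
counts <- DT[groups, on = c("Species", "Bin"), .N, by = .EACHI]
print(counts)
```

Because the result has exactly one row per row of `groups`, any combination that should be reported can be added to `groups` before joining.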

A note on `.N`:

• This example uses `.N` in `j`, where it refers to the size of a subset.
• In `i`, it refers to the total number of rows.
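
The two meanings can be seen side by side in a short sketch on the `iris` data. The names `last_row` and `last_three` are illustrative only.

```r
library(data.table)

DT <- data.table(iris)

# in j: .N is the number of rows in the subset selected by i (here, all rows)
n_all <- DT[, .N]                 # 150 for iris

# in j with a filter in i: only the rows that pass the filter are counted
n_long <- DT[Sepal.Length > 7, .N]

# in i: .N is the total number of rows, so this selects the last row
last_row <- DT[.N]

# and this selects the last three rows
last_three <- DT[(.N - 2L):.N]
```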