pandas Tutorial => Basic grouping

Example

Group by one column

Using the following DataFrame:

import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'a', 'b', 'b'],
                   'B': [2, 8, 1, 4, 3, 8],
                   'C': [102, 98, 107, 104, 115, 87]})

df
# Output: 
#    A  B    C
# 0  a  2  102
# 1  b  8   98
# 2  c  1  107
# 3  a  4  104
# 4  b  3  115
# 5  b  8   87

Group by column A and get the mean value of other columns:

df.groupby('A').mean()
# Output:
#           B      C
# A
# a  3.000000  103.0
# b  6.333333  100.0
# c  1.000000  107.0
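When only one column is of interest, it can be selected from the grouped object before aggregating. The following is a minimal sketch using the same DataFrame; the names mean_c and manual_b are illustrative only, and the boolean-mask computation is included just to show what the group mean corresponds to.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'a', 'b', 'b'],
                   'B': [2, 8, 1, 4, 3, 8],
                   'C': [102, 98, 107, 104, 115, 87]})

# Select column C from the grouped object before aggregating;
# the result is a Series indexed by the group keys of column A.
mean_c = df.groupby('A')['C'].mean()

# Recompute the mean for group 'b' by hand with a boolean mask
# (illustrative check, not something groupby requires).
manual_b = df.loc[df['A'] == 'b', 'C'].mean()

print(mean_c)
print(manual_b)
```

Selecting the column first avoids aggregating columns that are not needed and returns a Series rather than a DataFrame.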

Group by multiple columns

Pass a list of column names to group by more than one column:

df.groupby(['A','B']).mean()
# Output: 
#          C
# A B       
# a 2  102.0
#   4  104.0
# b 3  115.0
#   8   92.5
# c 1  107.0

Note that after grouping by several columns, the resulting DataFrame is indexed by a MultiIndex, so each row is labelled by a tuple (in this case a pair of values from columns A and B).
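As a rough illustration of what the tuple labels mean in practice, the sketch below looks up a single group with .loc and then turns the index levels back into ordinary columns with reset_index. The variable names grouped, value and flat are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'a', 'b', 'b'],
                   'B': [2, 8, 1, 4, 3, 8],
                   'C': [102, 98, 107, 104, 115, 87]})

grouped = df.groupby(['A', 'B']).mean()

# Each row label is a tuple (value of A, value of B).
print(grouped.index.tolist())

# Look up the mean of C for the group A == 'b', B == 8.
value = grouped.loc[('b', 8), 'C']
print(value)

# Move the index levels A and B back into regular columns.
flat = grouped.reset_index()
print(flat)
```

Passing as_index=False to groupby is another way to keep the grouping keys as regular columns.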

To apply several aggregation methods at once, for instance to count the number of items in each group and compute their mean, use the agg method with a list of function names:

df.groupby(['A','B']).agg(['count', 'mean'])
# Output:
#         C       
#     count   mean
# A B             
# a 2     1  102.0
#   4     1  104.0
# b 3     1  115.0
#   8     2   92.5
# c 1     1  107.0
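The list form of agg produces two-level column labels such as ('C', 'count'). If flat column names are preferred, named aggregation is one alternative, assuming pandas 0.25 or later. In the sketch below, the output names n and mean_c are arbitrary choices for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'a', 'b', 'b'],
                   'B': [2, 8, 1, 4, 3, 8],
                   'C': [102, 98, 107, 104, 115, 87]})

# Named aggregation: each keyword is an output column name, and each
# value is a (source column, aggregation function) pair.
summary = df.groupby(['A', 'B']).agg(
    n=('C', 'count'),
    mean_c=('C', 'mean'),
)

print(summary)
```

The values match those from the list form; only the column labels differ.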
