import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))
Use describe() to generate various summary statistics. For numeric values it reports the number of non-NA/null values (count), the mean (mean), the standard deviation (std), and the values known as the five-number summary:
min: minimum (smallest observation)
25%: lower quartile or first quartile (Q1)
50%: median (middle value, Q2)
75%: upper quartile or third quartile (Q3)
max: maximum (largest observation)
>>> df.describe()
              A         B         C         D         E
count  5.000000  5.000000  5.000000  5.000000  5.000000
mean  -0.456917 -0.278666  0.334173  0.863089  0.211153
std    0.925617  1.091155  1.024567  1.238668  1.495219
min   -1.494346 -2.031457 -0.336471 -0.821447 -2.106488
25%   -1.143098 -0.407362 -0.246228 -0.087088 -0.082451
50%   -0.536503 -0.163950 -0.004099  1.509749  0.313918
75%    0.092630  0.381407  0.120137  1.822794  1.060268
max    0.796729  0.828034  2.137527  1.891436  1.870520
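
As a rough cross-check, each row of the summary can be reproduced with the matching column-wise method. The sketch below is illustrative rather than part of the original example. It assumes a seeded generator (the seed 0 is an arbitrary choice) so the numbers are reproducible, which means they will differ from the table above. The names rng, summary, quartiles, and checks are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so repeated runs give the same frame (seed chosen arbitrarily).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 5)), columns=list("ABCDE"))

summary = df.describe()

# Quartiles computed directly; the rows are labelled 0.25, 0.5 and 0.75.
quartiles = df.quantile([0.25, 0.5, 0.75])

# Map each row label of the summary to the method that should reproduce it.
checks = {
    "count": df.count(),          # number of non-NA/null values per column
    "mean": df.mean(),
    "std": df.std(),              # sample standard deviation (ddof=1)
    "min": df.min(),
    "25%": quartiles.loc[0.25],
    "50%": quartiles.loc[0.5],    # the median
    "75%": quartiles.loc[0.75],
    "max": df.max(),
}

for label, expected in checks.items():
    # Raises AssertionError if a summary row disagrees with the direct call.
    assert np.allclose(summary.loc[label], expected), label

print(summary)
```

If only a few of these figures are needed, calling the individual methods directly avoids computing the full summary.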