pandas Missing Data: Filling missing values


Example

In [10]: import numpy as np
         import pandas as pd

In [11]: df = pd.DataFrame([[1, 2, None, 3], [4, None, 5, 6],
                            [7, 8, 9, 10], [None, None, None, None]])
         df

Out[11]: 
     0    1    2     3
0  1.0  2.0  NaN   3.0
1  4.0  NaN  5.0   6.0
2  7.0  8.0  9.0  10.0
3  NaN  NaN  NaN   NaN

Fill missing values with a single value:

In [12]: df.fillna(0)
Out[12]: 
     0    1    2     3
0  1.0  2.0  0.0   3.0
1  4.0  0.0  5.0   6.0
2  7.0  8.0  9.0  10.0
3  0.0  0.0  0.0   0.0   

This returns a new DataFrame and leaves the original unchanged. To change the original, either pass the inplace parameter (df.fillna(0, inplace=True)) or assign the result back to the original name (df = df.fillna(0)).
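
The difference between the returned copy and the original can be checked by counting the missing cells before and after. This is a minimal sketch that rebuilds the same df as above; the names filled and the missing_* counters are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, None, 3], [4, None, 5, 6],
                   [7, 8, 9, 10], [None, None, None, None]])

filled = df.fillna(0)  # returns a new DataFrame; df itself is not modified

missing_in_original = int(df.isna().sum().sum())    # NaN cells still in df
missing_in_filled = int(filled.isna().sum().sum())  # NaN cells in the copy

df = df.fillna(0)  # assigning back replaces the original
missing_after_assign = int(df.isna().sum().sum())

print(missing_in_original, missing_in_filled, missing_after_assign)  # 6 0 0
```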

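Fill missing values with the previous ones (forward fill, applied down each column). A cell with no earlier value in its column, such as row 0 of column 2, stays NaN:

In [13]: df.ffill()  # same result as df.fillna(method='ffill') or method='pad'; the method keyword is deprecated since pandas 2.1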
Out[13]: 
     0    1    2     3
0  1.0  2.0  NaN   3.0
1  4.0  2.0  5.0   6.0
2  7.0  8.0  9.0  10.0
3  7.0  8.0  9.0  10.0
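
To see which cells a forward fill cannot reach, the remaining NaN positions can be listed. This is a small sketch on the same data; forward and leftover are names chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, None, 3], [4, None, 5, 6],
                   [7, 8, 9, 10], [None, None, None, None]])

forward = df.ffill()  # each NaN takes the nearest earlier value in its column

# Positions (row, column) that are still missing after the forward fill:
# only cells with no value above them in the same column.
rows, cols = forward.isna().to_numpy().nonzero()
leftover = [(int(r), int(c)) for r, c in zip(rows, cols)]

print(leftover)  # [(0, 2)]
```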

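Fill with the next ones (backward fill). The last row stays NaN because no later row exists to copy from:

In [14]: df.bfill()  # same result as df.fillna(method='bfill'); the method keyword is deprecated since pandas 2.1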
Out[14]: 
     0    1    2     3
0  1.0  2.0  5.0   3.0
1  4.0  8.0  5.0   6.0
2  7.0  8.0  9.0  10.0
3  NaN  NaN  NaN   NaN
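
Because a backward fill leaves trailing gaps and a forward fill leaves leading ones, the two are sometimes chained so that each covers the other's edge. The sketch below assumes that taking a value from either direction is acceptable for your data; backward and both_ways are illustrative names.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, None, 3], [4, None, 5, 6],
                   [7, 8, 9, 10], [None, None, None, None]])

backward = df.bfill()           # last row has nothing below it, so it stays NaN
both_ways = df.bfill().ffill()  # the forward fill then covers the last row

print(backward.iloc[3].isna().all())  # True
print(both_ways.isna().any().any())   # False
```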

Fill using another DataFrame:

In [15]: df2 = pd.DataFrame(np.arange(100, 116).reshape(4, 4))
         df2
Out[15]: 
     0    1    2    3
0  100  101  102  103
1  104  105  106  107
2  108  109  110  111
3  112  113  114  115

In [16]: df.fillna(df2)  # each NaN in df takes the df2 cell with the same row and column label
Out[16]: 
       0      1      2      3
0    1.0    2.0  102.0    3.0
1    4.0  105.0    5.0    6.0
2    7.0    8.0    9.0   10.0
3  112.0  113.0  114.0  115.0
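
The match between the two DataFrames is made on row and column labels, not on position, so a fill DataFrame that covers only some labels fills only those cells. The sketch below illustrates this with a hypothetical one-cell DataFrame named partial.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, None, 3], [4, None, 5, 6],
                   [7, 8, 9, 10], [None, None, None, None]])

# Hypothetical fill source that only has a value for row 0, column 2.
partial = pd.DataFrame({2: [-1.0]}, index=[0])

result = df.fillna(partial)

value_at_0_2 = float(result.loc[0, 2])              # filled from partial
still_missing = int(result.isna().sum().sum())      # cells partial did not cover

print(value_at_0_2, still_missing)
```

Cells of df whose labels are absent from the fill DataFrame are left as NaN.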