Tutorial by Examples

# Create a sample DF
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))
# Show DF
df
          A         B         C
0 -0.467542  0.469146 -0.861848
1 -0.823205 -0.167087 -0.759942
2 -1.508202  1.361894 -0.166701
3  0.394143 -0.287349 -0.978102
4 -0.160431  1.054736 -0.785250
...
The iloc (short for integer location) indexer lets you select the rows of a dataframe by their integer position. This way one can slice dataframes just as one slices Python lists.
df = pd.DataFrame([[11, 22], [33, 44], [55, 66]], index=list("abc"))
df
# Out:
# 0...
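A minimal sketch of position-based selection, reusing the small frame from the teaser above. The variable names (first_row, first_two, last_row, cell) are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame([[11, 22], [33, 44], [55, 66]], index=list("abc"))

# A single integer returns that row as a Series.
first_row = df.iloc[0]

# Slices follow Python list semantics: the stop position is excluded.
first_two = df.iloc[:2]

# Negative positions count from the end.
last_row = df.iloc[-1]

# A (row, column) pair of positions returns a single cell.
cell = df.iloc[1, 0]
```

Note that iloc ignores the index labels "a", "b", "c" entirely; only the position of each row matters.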
When using labels, both the start and the stop are included in the results.
import pandas as pd
import numpy as np
np.random.seed(5)
df = pd.DataFrame(np.random.randint(100, size=(5, 5)), columns=list("ABCDE"), index=["R" + str(i) for i in range(5)])
...
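The inclusive behaviour of label slices can be checked against position slices. The sketch below uses a small hypothetical frame with fixed values (not the random one from the teaser) so that the row counts are easy to verify.

```python
import pandas as pd

# Hypothetical data: one column, row labels R0..R4.
df = pd.DataFrame({"A": [10, 20, 30, 40, 50]},
                  index=["R" + str(i) for i in range(5)])

# Label-based slice: both "R1" and "R3" are included (3 rows).
by_label = df.loc["R1":"R3"]

# Position-based slice: position 3 is excluded (2 rows).
by_position = df.iloc[1:3]
```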
DataFrame:
import pandas as pd
import numpy as np
np.random.seed(5)
df = pd.DataFrame(np.random.randint(100, size=(5, 5)), columns=list("ABCDE"), index=["R" + str(i) for i in range(5)])
df
Out[12]:
     A   B   C   D   E
R0  99  78  61  16  73...
One can select rows and columns of a dataframe using boolean arrays.
import pandas as pd
import numpy as np
np.random.seed(5)
df = pd.DataFrame(np.random.randint(100, size=(5, 5)), columns=list("ABCDE"), index=["R" + str(i) for i in range(5)])
print (...
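As a rough illustration of boolean selection, the sketch below uses a small hypothetical frame with fixed values. The mask names (row_mask, col_mask) are assumptions made for this example.

```python
import pandas as pd

# Hypothetical data with fixed values so the result is predictable.
df = pd.DataFrame({"A": [1, 5, 3], "B": [4, 2, 6]},
                  index=["R0", "R1", "R2"])

# One boolean per row: keep R0 and R2.
row_mask = [True, False, True]
rows = df.loc[row_mask]

# One boolean per column: keep only column B.
col_mask = [False, True]
sub = df.loc[row_mask, col_mask]

# A comparison on a column produces a boolean Series usable as a mask.
filtered = df[df["A"] > 2]
```

Each mask must have the same length as the axis it filters.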
Generate a sample DF:
In [39]: df = pd.DataFrame(np.random.randint(0, 10, size=(5, 6)), columns=['a10','a20','a25','b','c','d'])
In [40]: df
Out[40]:
   a10  a20  a25  b  c  d
0    2    3    7  5  4  7
1    3    1    5  7  2  6
2    7    4    9  0  8  7
3    5    8    8  9  6  8
4    8    1 ...
import pandas as pd
import numpy as np
# generate random DF
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 3)), columns=list('ABC'))
In [16]: print(df)
   A  B  C
0  4  1  4
1  0  2  0
2  7  8  8
3  2  1  9
4  7  3  8
5  4  0  7
6  1  5  5
7  6  7  8
8  6  7  3
9  6  4  5
select rows where value...
It may become necessary to traverse the elements of a series or the rows of a dataframe such that the choice of the next element or row depends on the previously selected one. This is called path dependency. Consider the following time series s with irregular frequency.
#starting pytho...
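One way to picture path dependency is a selection rule in which each pick depends on the previous pick. The sketch below is an assumption made for illustration, not the original example: the series values, the dates, and the rule "keep a point only if it is at least min_gap after the last kept point" are all hypothetical.

```python
import pandas as pd

# Hypothetical irregular time series.
s = pd.Series(
    [1, 2, 3, 4, 5],
    index=pd.to_datetime(
        ["2017-01-01", "2017-01-03", "2017-01-10", "2017-01-12", "2017-01-25"]
    ),
)

# Hypothetical rule: keep a timestamp only if it falls at least
# 7 days after the previously kept timestamp.
min_gap = pd.Timedelta(days=7)

selected = []
last_kept = None
for ts in s.index:
    if last_kept is None or ts - last_kept >= min_gap:
        selected.append(ts)
        last_kept = ts  # the next decision depends on this choice

result = s.loc[selected]
```

Because each decision depends on the last kept timestamp rather than on the immediately preceding row, a rule like this cannot be written as a simple vectorized comparison between neighbouring rows, which is why an explicit loop is used here.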
To view the first or last few records of a dataframe, you can use the methods head and tail.
To return the first n rows, use DataFrame.head([n]):
df.head(n)
To return the last n rows, use DataFrame.tail([n]):
df.tail(n)
Without the argument n, these methods return 5 rows. Note that the slice ...
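A short sketch of head and tail on a hypothetical ten-row frame; the column name x and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical frame with ten rows, x = 0..9.
df = pd.DataFrame({"x": range(10)})

default_head = df.head()   # no argument: first 5 rows
first_three = df.head(3)   # first 3 rows
last_two = df.tail(2)      # last 2 rows
```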
Let
df = pd.DataFrame({'col_1':['A','B','A','B','C'], 'col_2':[3,4,3,5,6]})
df
# Output:
#   col_1  col_2
# 0     A      3
# 1     B      4
# 2     A      3
# 3     B      5
# 4     C      6
To get the distinct values in col_1 you can use Series.unique()
df['col_1'].unique()
# Output: ...
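The sketch below runs Series.unique() on the frame from the teaser and, as an addition not taken from the original, shows two related calls that are often useful alongside it: Series.nunique() and DataFrame.drop_duplicates().

```python
import pandas as pd

df = pd.DataFrame({'col_1': ['A', 'B', 'A', 'B', 'C'],
                   'col_2': [3, 4, 3, 5, 6]})

# Distinct values of one column, in order of first appearance.
distinct = list(df['col_1'].unique())

# Number of distinct values.
n_distinct = df['col_1'].nunique()

# Distinct rows across all columns (row 2 duplicates row 0).
distinct_rows = df.drop_duplicates()
```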
If you have a dataframe with missing data (NaN, pd.NaT, None), you can filter out incomplete rows.
df = pd.DataFrame([[0,1,2,3], [None,5,None,pd.NaT], [8,None,10,None], [11,12,13,pd.NaT]], columns=list('ABCD'))
df
# Output:
# A B ...
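A minimal sketch of row filtering with DataFrame.dropna(), assuming that is the kind of filtering meant here. It uses a simplified three-column frame (hypothetical, without the pd.NaT column) so the surviving rows are easy to check.

```python
import pandas as pd

# Hypothetical simplified data: None marks a missing value.
df = pd.DataFrame({"A": [0, None, 8, 11],
                   "B": [1, 5, None, 12],
                   "C": [2, None, 10, 13]})

# Drop every row that has at least one missing value.
complete = df.dropna()

# Drop rows only when a specific column is missing.
has_a = df.dropna(subset=["A"])

# Keep rows with at least 2 non-missing values.
mostly_complete = df.dropna(thresh=2)
```

dropna returns a new frame and leaves df unchanged.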
