pandas Tutorial => Reading csv file into DataFrame

Example

This example reads a file named data_file.csv with the following contents:

File:

index,header1,header2,header3
1,str_data,12,1.4
3,str_data,22,42.33
4,str_data,2,3.44
2,str_data,43,43.34

7,str_data,25,23.32

Code:

import pandas as pd

pd.read_csv('data_file.csv')

Output:

   index    header1  header2  header3
0      1   str_data       12     1.40
1      3   str_data       22    42.33
2      4   str_data        2     3.44
3      2   str_data       43    43.34
4      7   str_data       25    23.32
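
To try the example without creating a file on disk, the same data can be passed to read_csv through io.StringIO. This is only a minimal sketch: the csv_text variable is an assumed in-memory stand-in for data_file.csv, not part of the original example.

```python
import io

import pandas as pd

# In-memory stand-in for data_file.csv (note the blank line before the last row).
csv_text = """index,header1,header2,header3
1,str_data,12,1.4
3,str_data,22,42.33
4,str_data,2,3.44
2,str_data,43,43.34

7,str_data,25,23.32
"""

# read_csv accepts a file-like object as well as a path.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # column types inferred by pandas
```

The blank line is skipped by default, so the resulting DataFrame has five rows.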

Some useful arguments:

sep: The default field delimiter is a comma (,). Use this option if the file uses a different delimiter, for instance pd.read_csv('data_file.csv', sep=';').
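
The effect of sep can be seen by reading the same semicolon-delimited text twice. This is an illustrative sketch; semicolon_text and the column names a, b, c are made up for the demonstration.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data, used only for illustration.
semicolon_text = "a;b;c\n1;2;3\n4;5;6\n"

# With the default comma delimiter, each line is read as one single field.
wrong = pd.read_csv(io.StringIO(semicolon_text))

# Passing sep=';' splits the fields as intended.
right = pd.read_csv(io.StringIO(semicolon_text), sep=";")

print(wrong.shape, right.shape)
```

A DataFrame with a single column whose name contains the delimiter is usually a sign that sep needs to be set.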

index_col: With index_col=n (n an integer, counted from 0), pandas uses column n as the index of the DataFrame. In the above example:

pd.read_csv('data_file.csv', index_col=0)

Output:

          header1  header2  header3
index
 1       str_data       12     1.40
 3       str_data       22    42.33
 4       str_data        2     3.44
 2       str_data       43    43.34
 7       str_data       25    23.32
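
Once a column is used as the index, rows can be looked up by its values with .loc. The sketch below assumes a shortened in-memory copy of the data and selects the index column by name rather than by position, which read_csv also accepts.

```python
import io

import pandas as pd

# Shortened in-memory copy of the example data (an assumption for this sketch).
csv_text = """index,header1,header2,header3
1,str_data,12,1.4
3,str_data,22,42.33
4,str_data,2,3.44
"""

# index_col accepts a column name as well as a column position.
df = pd.read_csv(io.StringIO(csv_text), index_col="index")

# Label-based lookup: the row whose index value is 3.
value = df.loc[3, "header2"]
print(value)
```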

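skip_blank_lines: By default, blank lines are skipped. Use skip_blank_lines=False to keep them; each blank line then becomes a row filled with NaN values. Because NaN is a floating-point value, integer columns that receive a NaN row are converted to float, so depending on the pandas version the integers may be displayed as 12.0 rather than 12.

pd.read_csv('data_file.csv', index_col=0, skip_blank_lines=False)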

Output:

         header1  header2  header3
index
 1      str_data       12     1.40
 3      str_data       22    42.33
 4      str_data        2     3.44
 2      str_data       43    43.34
NaN          NaN      NaN      NaN
 7      str_data       25    23.32
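
The difference between the two settings can be checked side by side. This is a small sketch on assumed in-memory data with one blank line; it also shows the type change caused by the NaN row.

```python
import io

import pandas as pd

# Two data rows separated by one blank line (illustrative data only).
csv_text = "index,header1,header2\n1,str_data,12\n\n7,str_data,25"

skipped = pd.read_csv(io.StringIO(csv_text))                      # default behaviour
kept = pd.read_csv(io.StringIO(csv_text), skip_blank_lines=False)  # blank line kept

print(len(skipped), len(kept))
print(skipped["header2"].dtype, kept["header2"].dtype)

# Rows that are entirely NaN can be dropped again afterwards if needed.
cleaned = kept.dropna(how="all")
```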

parse_dates: Use this option to parse the given columns as dates.

File:

date_begin;date_end;header3;header4;header5
1/1/2017;1/10/2017;str_data;1001;123,45
2/1/2017;2/10/2017;str_data;1001;67,89
3/1/2017;3/10/2017;str_data;1001;0

Code to parse columns 0 and 1 as dates (assuming the file above is saved as f.csv):

pd.read_csv('f.csv', sep=';', parse_dates=[0, 1])

Output:

  date_begin   date_end   header3  header4 header5
0 2017-01-01 2017-01-10  str_data     1001  123,45
1 2017-02-01 2017-02-10  str_data     1001   67,89
2 2017-03-01 2017-03-10  str_data     1001       0

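By default, the date format is inferred, and ambiguous dates such as 1/10/2017 are read month-first, as in the output above. To specify the format explicitly (here day/month/year), you can pass a parser function, for instance:

from datetime import datetime
# Parse each value as day/month/year
dateparse = lambda x: datetime.strptime(x, '%d/%m/%Y')
pd.read_csv('f.csv', sep=';', parse_dates=[0, 1], date_parser=dateparse)

Note that date_parser is deprecated as of pandas 2.0; on those versions, pass date_format='%d/%m/%Y' instead.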

Output:

  date_begin   date_end   header3  header4 header5
0 2017-01-01 2017-10-01  str_data     1001  123,45
1 2017-01-02 2017-10-02  str_data     1001   67,89
2 2017-01-03 2017-10-03  str_data     1001       0
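
An alternative that does not depend on a parser function is to read the date columns as text and convert them afterwards with pd.to_datetime and an explicit format. The sketch below is one possible approach, not the only one; it uses an in-memory copy of the file shown above.

```python
import io

import pandas as pd

# In-memory copy of the semicolon-delimited example file.
dates_text = """date_begin;date_end;header3;header4;header5
1/1/2017;1/10/2017;str_data;1001;123,45
2/1/2017;2/10/2017;str_data;1001;67,89
3/1/2017;3/10/2017;str_data;1001;0
"""

# Read the date columns as plain strings first...
df = pd.read_csv(io.StringIO(dates_text), sep=";")

# ...then convert them with an explicit day/month/year format.
for col in ["date_begin", "date_end"]:
    df[col] = pd.to_datetime(df[col], format="%d/%m/%Y")

print(df[["date_begin", "date_end"]])
```

Stating the format explicitly avoids any ambiguity between day-first and month-first dates.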

More information on the function's parameters can be found in the official documentation.
