numpy Tutorial => Reading CSV files

Example

Three main functions are available (descriptions taken from the NumPy documentation):

fromfile - A highly efficient way of reading binary data with a known data-type, as well as parsing simply formatted text files. Data written using the tofile method can be read using this function.

genfromtxt - Load data from a text file, with missing values handled as specified. Each line past the first skip_header lines is split at the delimiter character, and characters following the comments character are discarded.

loadtxt - Load data from a text file. Each row in the text file must have the same number of values.
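The note that data written with tofile can be read back with fromfile can be sketched as a small round trip. This is only an illustration: the temporary file and the variable names are assumptions, not part of the original example.

```python
import os
import tempfile

import numpy as np

# Example data to write out (hypothetical values).
original = np.arange(6, dtype=np.float64)

# Create a temporary file path for the raw binary data.
fd, path = tempfile.mkstemp(suffix=".bin")
os.close(fd)

# tofile writes raw bytes only; no dtype or shape information is stored.
original.tofile(path)

# fromfile therefore has to be told the dtype that was used for writing.
restored = np.fromfile(path, dtype=np.float64)

os.remove(path)
```

Because the file carries no metadata, the dtype passed to fromfile must match the one used when writing, and any multi-dimensional shape has to be restored separately with reshape.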

genfromtxt and loadtxt both read text files, but genfromtxt is the more flexible of the two: it has many parameters for dealing with the input file, including the handling of missing values. This usually makes it the most straightforward one to use.
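As a rough illustration of the difference in missing-value handling, the sketch below reads the same text with both functions. The in-memory data and the fill value -1.0 are assumptions chosen for the example.

```python
import io

import numpy as np

# Hypothetical CSV content with one empty field in the second row.
text = "1,2,3\n4,,6\n"

# genfromtxt treats the empty field as missing and substitutes filling_values.
filled = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=-1.0)

# loadtxt has no missing-value handling, so the empty field cannot be
# converted to a float and a ValueError is raised.
try:
    np.loadtxt(io.StringIO(text), delimiter=",")
    loadtxt_failed = False
except ValueError:
    loadtxt_failed = True
```

If filling_values is omitted, genfromtxt fills missing floating-point fields with nan instead.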

Consistent number of columns, consistent data type (numerical or string):

Given an input file, myfile.csv, with the contents:

#descriptive text line to skip
1.0, 2, 3
4, 5.5, 6

import numpy as np
# skip_header=1 skips the first line; genfromtxt does not accept skiprows in current NumPy
np.genfromtxt('path/to/myfile.csv', delimiter=',', skip_header=1)

gives an array:

array([[ 1. ,  2. ,  3. ],
       [ 4. ,  5.5,  6. ]])
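For comparison, the same file contents can be read with loadtxt. Note that the two functions name the header-skipping parameter differently: skip_header for genfromtxt and skiprows for loadtxt. The sketch below uses an in-memory copy of the file (an assumption made to keep the example self-contained).

```python
import io

import numpy as np

# In-memory stand-in for myfile.csv.
text = "#descriptive text line to skip\n1.0, 2, 3\n4, 5.5, 6\n"

# genfromtxt: the number of leading lines to skip is given by skip_header.
from_genfromtxt = np.genfromtxt(io.StringIO(text), delimiter=",", skip_header=1)

# loadtxt: the equivalent parameter is called skiprows.
from_loadtxt = np.loadtxt(io.StringIO(text), delimiter=",", skiprows=1)
```

Since the first line starts with #, the default comments character of both functions, it would also be ignored without explicitly skipping it.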

Consistent number of columns, mixed data type (across columns):

1   2.0000  buckle_my_shoe
3   4.0000  margery_door

import numpy as np
np.genfromtxt('filename', dtype=None)

gives an array:

array([(1, 2.0, 'buckle_my_shoe'), (3, 4.0, 'margery_door')], 
dtype=[('f0', '<i4'), ('f1', '<f8'), ('f2', '|S14')])

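Note that dtype=None makes genfromtxt infer the type of each column individually. The result is a one-dimensional structured array (not a recarray) with the default field names f0, f1 and f2. Depending on the NumPy version and the encoding argument, the string column may be reported as a Unicode type such as '<U14' rather than '|S14'.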
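The columns of such a result can be selected by field name. The following is a minimal sketch, assuming an in-memory copy of the file and encoding="utf-8" so that the text column is decoded to Python strings.

```python
import io

import numpy as np

# In-memory stand-in for the whitespace-separated file shown above.
text = "1   2.0000  buckle_my_shoe\n3   4.0000  margery_door\n"

# dtype=None infers one type per column; encoding decodes the text column.
data = np.genfromtxt(io.StringIO(text), dtype=None, encoding="utf-8")

# Each column is available under its default field name.
ints = data["f0"]
floats = data["f1"]
names = data["f2"]

# Indexing with an integer returns one row with all three fields.
first_row = data[0]
```

Passing names=... to genfromtxt would replace the default field names f0, f1, f2 with custom ones.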

Inconsistent number of columns:

file: 1 2 3 4 5 6 7 8 9 10 11 22 13 14 15 16 17 18 19 20 21 22 23 24

Read into a single one-dimensional array:

result = np.fromfile(path_to_file, dtype=float, sep="\t", count=-1)
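A self-contained sketch of this approach is given below. It is an illustration under assumptions: the ragged file is a hypothetical temporary file, and sep=" " is used instead of "\t" because, according to the NumPy documentation, a space in the separator matches zero or more whitespace characters, which lets the read continue across line breaks.

```python
import os
import tempfile

import numpy as np

# Hypothetical ragged file: each row has a different number of values.
content = "1 2 3\n4 5\n6 7 8 9\n"

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(content)
    path_to_file = handle.name

# sep=" " matches any run of whitespace, including tabs and newlines,
# so all values end up in one flat array regardless of the row lengths.
result = np.fromfile(path_to_file, dtype=float, sep=" ", count=-1)

os.remove(path_to_file)
```

The row structure is lost in the process, so this only fits cases where a flat sequence of numbers is what is needed.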
