If we need to parse a large file, e.g. a CSV of more than 10 MB containing millions of rows, some developers reach for the file or file_get_contents functions and end up hitting the memory_limit setting with an "Allowed memory size of XXXXX bytes exhausted" error. Consider the following code (top-1m.csv has exactly 1 million rows and is about 22 MB in size):
var_dump(memory_get_usage(true));
$arr = file('top-1m.csv');
var_dump(memory_get_usage(true));
This outputs:
int(262144)
int(210501632)
because the interpreter has to hold all one million rows in the $arr
array at once, so it consumes roughly 200 MB of RAM. Note that we haven't even done anything with the contents of the array yet.
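The file_get_contents function behaves the same way: it loads the entire file into a single string, so peak memory again grows roughly with the file size. A minimal sketch of the same measurement (exact numbers will vary between systems and PHP versions):

var_dump(memory_get_usage(true));
// the whole ~22 MB file ends up in one string in memory
$data = file_get_contents('top-1m.csv');
var_dump(memory_get_usage(true));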
Now consider the following code:
var_dump(memory_get_usage(true));
$index = 1;
if (($handle = fopen("top-1m.csv", "r")) !== FALSE) {
    while (($row = fgetcsv($handle, 1000, ",")) !== FALSE) {
        file_put_contents('top-1m-reversed.csv', $index . ',' . strrev($row[1]) . PHP_EOL, FILE_APPEND);
        $index++;
    }
    fclose($handle);
}
var_dump(memory_get_usage(true));
which outputs
int(262144)
int(262144)
so the reported memory usage does not grow at all, yet we parse the whole CSV and save it to another file with the value of the 2nd column reversed. That's because fgetcsv
reads only one row per call and $row
is overwritten on every iteration of the loop.
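As a possible refinement (it doesn't change the memory behaviour), note that file_put_contents with FILE_APPEND opens and closes the output file once per row, i.e. a million times here. Keeping a second handle open and writing with fwrite stays just as memory-friendly while avoiding the repeated open/close. A sketch under the same assumptions (the value to reverse is in the 2nd column of top-1m.csv):

var_dump(memory_get_usage(true));
$index = 1;
if (($in = fopen("top-1m.csv", "r")) !== FALSE && ($out = fopen("top-1m-reversed.csv", "w")) !== FALSE) {
    while (($row = fgetcsv($in, 1000, ",")) !== FALSE) {
        // still only one row held in memory at a time
        fwrite($out, $index . ',' . strrev($row[1]) . PHP_EOL);
        $index++;
    }
    fclose($in);
    fclose($out);
}
var_dump(memory_get_usage(true));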