When opening a text file, you may specify its encoding explicitly with a three-argument open(). The encoder/decoder attached to a file handle in this way is called an "I/O layer":
my $filename = '/path/to/file';
open my $fh, '<:encoding(utf-8)', $filename or die "Failed to open $filename: $!";
See Remarks for a discussion of the differences between :utf8 and :encoding(utf-8).
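To make the effect of the layer visible, the following minimal sketch writes a short string through ':encoding(utf-8)' and then reads it back twice, once as raw bytes and once decoded. The sample string and the use of File::Temp for a scratch file are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: "cafe" with an acute accent on the e (U+00E9).
my $text = "caf\x{e9}";

# Scratch file for the demonstration; removed automatically at exit.
my ($tmp, $filename) = tempfile(UNLINK => 1);
close $tmp or die "Failed to close $filename: $!";

# Write through the encoding layer: characters are stored as UTF-8 bytes.
open my $out, '>:encoding(utf-8)', $filename or die "Failed to open $filename: $!";
print {$out} $text;
close $out or die "Failed to close $filename: $!";

# Read back without decoding: U+00E9 shows up as two bytes.
open my $raw, '<:raw', $filename or die "Failed to open $filename: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# Read back through the layer: the bytes are decoded into characters again.
open my $fh, '<:encoding(utf-8)', $filename or die "Failed to open $filename: $!";
my $chars = do { local $/; <$fh> };
close $fh;

printf "bytes: %d, characters: %d\n", length $bytes, length $chars;
# prints "bytes: 5, characters: 4"
```

The raw read sees five bytes because the accented character occupies two bytes in UTF-8, while the decoded read sees the original four characters.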
Alternatively, you may use binmode() to set the encoding for an individual file handle that is already open:
my $filename = '/path/to/file';
open my $fh, '<', $filename or die "Failed to open $filename: $!";
binmode $fh, ':encoding(utf-8)';
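binmode() works the same way on output handles. As a small sketch, the example below attaches the layer to an in-memory file handle, which stands in for a real output file so that the result can be inspected; the buffer variable and the sample character are illustrative assumptions.

```perl
use strict;
use warnings;

# An in-memory file handle stands in for a real output file
# (an assumption made to keep this sketch self-contained).
my $buffer = '';
open my $out, '>', \$buffer or die "Failed to open in-memory handle: $!";

# Attach the layer after the handle is open, before any output is written.
binmode $out, ':encoding(utf-8)';

print {$out} "\x{263A}\n";   # U+263A WHITE SMILING FACE: one character
close $out or die "Failed to close in-memory handle: $!";

# The buffer holds the encoded bytes: three for U+263A plus one for the newline.
printf "%d bytes written\n", length $buffer;
# prints "4 bytes written"
```

The same call can be applied to the standard handles, for example binmode STDOUT, ':encoding(utf-8)'; to avoid "Wide character in print" warnings when printing non-ASCII text.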
To avoid setting the encoding for each file handle separately, you may use the open pragma to set a default I/O layer for all subsequent calls to the open() function and similar operators within the lexical scope of the pragma:
# Set input streams to ':encoding(utf-8)' and output streams to ':utf8'
use open (IN => ':encoding(utf-8)', OUT => ':utf8');
# Or to set all input and output streams to ':encoding(utf-8)'
use open ':encoding(utf-8)';
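Because the pragma is lexically scoped, it can be confined to a block. The sketch below is a hedged illustration of that scoping: the same file is opened once inside a block where the pragma is active and once outside it. It assumes no other default layers are in effect (for example via the PERL_UNICODE environment variable), and the scratch file created with File::Temp is purely illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file containing the two UTF-8 bytes of U+00E9.
my ($tmp, $filename) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xc3\xa9";
close $tmp or die "Failed to close $filename: $!";

my ($decoded, $raw);

{
    # Inside this block, open() without an explicit layer decodes UTF-8.
    use open ':encoding(utf-8)';
    open my $fh, '<', $filename or die "Failed to open $filename: $!";
    $decoded = do { local $/; <$fh> };
    close $fh;
}

# Outside the block the pragma no longer applies, so the data is read as bytes
# (assuming no other default layers are configured).
open my $fh, '<', $filename or die "Failed to open $filename: $!";
$raw = do { local $/; <$fh> };
close $fh;

printf "inside scope: %d character(s), outside scope: %d byte(s)\n",
    length $decoded, length $raw;
```

Limiting the pragma to the smallest block that needs it keeps the default from silently affecting unrelated open() calls elsewhere in the file.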
Finally, it is also possible to run the perl interpreter with the -CD flag, which applies UTF-8 as the default I/O layer. However, this option should be avoided, since it relies on specific user behaviour that can be neither predicted nor controlled.