.NET Framework Convert string to/from another encoding


.NET strings are sequences of System.Char values (UTF-16 code units). To save or otherwise handle text in another encoding, you have to work with an array of System.Byte.
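To make the difference between UTF-16 code units and encoded bytes concrete, here is a minimal sketch. The class name CodeUnitDemo and the sample string are chosen only for illustration. It compares the length of a string with the number of bytes it needs in several encodings.

```csharp
using System;
using System.Text;

static class CodeUnitDemo
{
    static void Main()
    {
        // U+00E9 (e with acute accent) followed by U+1F600, which lies outside the Basic Multilingual Plane.
        string text = "\u00E9\U0001F600";

        // Length counts UTF-16 code units: 1 for U+00E9, 2 (a surrogate pair) for U+1F600.
        Console.WriteLine(text.Length);                          // 3

        // The same text needs a different number of bytes in each encoding.
        Console.WriteLine(Encoding.UTF8.GetByteCount(text));     // 6 (2 + 4 bytes)
        Console.WriteLine(Encoding.Unicode.GetByteCount(text));  // 6 (3 code units * 2 bytes)
        Console.WriteLine(Encoding.UTF32.GetByteCount(text));    // 8 (2 code points * 4 bytes)
    }
}
```

Note that GetByteCount reports only the encoded characters; it does not include a byte order mark.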

Conversions are performed by classes derived from System.Text.Encoder and System.Text.Decoder which, together, convert between a byte[] array in some encoding X and a UTF-16 encoded System.String.

Because an encoder and its matching decoder are usually used together, they are grouped in a class derived from System.Text.Encoding. Derived classes offer conversions to and from popular encodings (UTF-8, UTF-16 and so on).
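The relationship between these classes can be sketched as follows. This is an illustrative example rather than a recommended pattern, and the class name DecoderDemo is made up for the sketch. It obtains a Decoder from Encoding.UTF8 and feeds it a two-byte UTF-8 sequence one byte at a time, which shows that a Decoder keeps state between calls.

```csharp
using System;
using System.Text;

static class DecoderDemo
{
    static void Main()
    {
        // U+00E9 is encoded in UTF-8 as two bytes: 0xC3 0xA9.
        byte[] bytes = Encoding.UTF8.GetBytes("\u00E9");

        // Every Encoding can hand out its matching Encoder and Decoder.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        char[] buffer = new char[4];

        // Feed the bytes one at a time, as if they arrived in separate chunks.
        int first = decoder.GetChars(bytes, 0, 1, buffer, 0);       // sequence incomplete, nothing produced
        int second = decoder.GetChars(bytes, 1, 1, buffer, first);  // sequence completed

        Console.WriteLine(first);                  // 0
        Console.WriteLine(second);                 // 1
        Console.WriteLine(buffer[0] == '\u00E9');  // True
    }
}
```

Encoding.GetString, by contrast, is stateless and treats the bytes passed to each call as complete input, which is why the Decoder is the better fit when data arrives in pieces.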


Convert a string to UTF-8

byte[] data = Encoding.UTF8.GetBytes("This is my text");

Convert UTF-8 data to a string

var text = Encoding.UTF8.GetString(data);

Change encoding of an existing text file

This code reads the content of a UTF-8 encoded text file and saves it back encoded as UTF-16. Note that this code is not optimal if the file is big, because it reads the entire content into memory:

var content = File.ReadAllText(path, Encoding.UTF8);
// Encoding.Unicode is the UTF-16 (little-endian) encoding.
File.WriteAllText(path, content, Encoding.Unicode);
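For big files, one possible alternative is to convert the text in fixed-size chunks with StreamReader and StreamWriter. The sketch below is a minimal example under some assumptions: the class FileTranscoder, the method Utf8ToUtf16 and the 4096-character buffer size are hypothetical choices. It also writes to a separate target file instead of overwriting the source, because the source is still being read while the output is written.

```csharp
using System;
using System.IO;
using System.Text;

static class FileTranscoder
{
    // Hypothetical helper: decodes sourcePath as UTF-8 and writes targetPath as UTF-16 (little-endian).
    public static void Utf8ToUtf16(string sourcePath, string targetPath)
    {
        using (var reader = new StreamReader(sourcePath, Encoding.UTF8))
        using (var writer = new StreamWriter(targetPath, false, Encoding.Unicode))
        {
            char[] buffer = new char[4096];
            int read;
            // Read decodes one chunk of characters; Write re-encodes it, so memory use stays bounded.
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }
    }

    static void Main()
    {
        // Temporary files are used here only to demonstrate a round trip.
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        File.WriteAllText(source, "This is my text", new UTF8Encoding(false));

        Utf8ToUtf16(source, target);

        Console.WriteLine(File.ReadAllText(target, Encoding.Unicode)); // This is my text
    }
}
```

If the converted file has to replace the original, the caller can move the target file over the source once the conversion has finished.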