Modern cryptography operates on bytes, not text, so the output of cryptographic algorithms is bytes. Sometimes encrypted data must be transferred via a text medium, in which case a binary-safe encoding must be used.
Parameter | Details |
---|---|
TE | Text Encoding. The transformation from text to bytes. UTF-8 is a common choice. |
BE | Binary Encoding. A transform which is capable of processing any arbitrary data and producing a valid string. Base64 is the most commonly used encoding, with Base16/hexadecimal a good runner-up. Wikipedia has a list of candidate encodings (stick to ones labelled "Arbitrary"). |
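
As a rough illustration of the two `BE` candidates named in the table, the sketch below encodes the same bytes with Base64 and Base16 using Python's standard `base64` module. The sample data is made up for the example; it simply covers every possible byte value to stand in for arbitrary cipher output.

```python
import base64

# Made-up sample: every byte value 0x00..0xFF, standing in for arbitrary cipher output.
data = bytes(range(256))

# Both encodings accept any byte sequence and produce plain ASCII text.
b64_text = base64.b64encode(data).decode("ascii")
hex_text = base64.b16encode(data).decode("ascii")

# Both decode back to exactly the original bytes.
b64_ok = base64.b64decode(b64_text) == data
hex_ok = base64.b16decode(hex_text) == data

# Base64 emits 4 characters per 3 input bytes (padded); Base16 emits 2 per byte.
print(len(data), len(b64_text), len(hex_text))  # 256 344 512
```

Base64 is more compact, while Base16 is easier to read and compare by eye; either is binary-safe.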
The general algorithm is:
- `Encrypt`:
  1. Convert `InputText` to `InputBytes` via encoding `TE` (text encoding).
  2. Encrypt `InputBytes` to `OutputBytes`.
  3. Convert `OutputBytes` to `OutputText` via `BE` (binary encoding).
- `Decrypt` (reverse `BE` and `TE` from `Encrypt`):
  1. Convert `InputText` to `InputBytes` via encoding `BE`.
  2. Decrypt `InputBytes` to `OutputBytes`.
  3. Convert `OutputBytes` to `OutputText` via `TE`.
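
The `Encrypt` and `Decrypt` steps above can be sketched as follows, assuming UTF-8 for `TE` and Base64 for `BE`. The function names `encrypt_text` and `decrypt_text` are hypothetical, and `_toy_cipher` is only a placeholder so the example runs with the standard library alone: it is not secure and stands in for whatever real cipher is being used.

```python
import base64
import hashlib

TE = "utf-8"  # text encoding chosen for this sketch

def _toy_keystream(key: bytes, length: int) -> bytes:
    """Placeholder keystream (SHA-256 over key + counter). NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def _toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts.
    Illustration only: substitute a real cipher from a vetted library."""
    return bytes(a ^ b for a, b in zip(data, _toy_keystream(key, len(data))))

def encrypt_text(key: bytes, input_text: str) -> str:
    input_bytes = input_text.encode(TE)            # 1. text -> bytes via TE
    output_bytes = _toy_cipher(key, input_bytes)   # 2. encrypt bytes -> bytes
    return base64.b64encode(output_bytes).decode("ascii")  # 3. bytes -> text via BE

def decrypt_text(key: bytes, input_text: str) -> str:
    input_bytes = base64.b64decode(input_text)     # 1. text -> bytes via BE
    output_bytes = _toy_cipher(key, input_bytes)   # 2. decrypt bytes -> bytes
    return output_bytes.decode(TE)                 # 3. bytes -> text via TE

key = b"example key"  # made-up key for the demonstration
token = encrypt_text(key, "héllo, wörld ✓")
print(token.isascii())                              # True
print(decrypt_text(key, token) == "héllo, wörld ✓") # True
```

Note that `Decrypt` applies the two encodings in the opposite order from `Encrypt`: `BE` is undone first, `TE` last.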

The most common mistake is to choose a "text encoding" instead of a "binary encoding" for `BE`, which is a problem if any encrypted byte (or any IV byte) is outside the range `0x20`-`0x7E` (for UTF-8 or ASCII). Since this "safe range" covers only 95 of the 256 possible byte values, the chances of a text encoding succeeding on more than a few bytes of ciphertext are vanishingly small.
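- If the encrypted data contains a `0x00` byte, then C/C++ programs will likely misinterpret it as the end of the string.
- If the encrypted data contains a `0x08` byte (backspace), it may erase the previous character (and the control code itself), making the `InputText` value passed to `Decrypt` have the wrong value (and the wrong length).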
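
A small sketch of these failure modes, using a made-up byte sequence in place of real cipher output (the specific bytes are chosen for illustration and include a `0x00`, a `0x08`, and bytes that are not valid UTF-8):

```python
import base64

# Made-up "ciphertext": 'A', NUL, backspace, an invalid UTF-8 pair (0xC3 0x28), 0xFF.
output_bytes = bytes([0x41, 0x00, 0x08, 0xC3, 0x28, 0xFF])

def decodes_as_utf8(data: bytes) -> bool:
    """Return True only if data happens to be valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Mistake: using a text encoding as BE. Strict decoding fails outright...
text_encoding_works = decodes_as_utf8(output_bytes)

# ...and lenient decoding substitutes U+FFFD, so the original bytes are lost.
lossy_text = output_bytes.decode("utf-8", errors="replace")
lossy_round_trip = lossy_text.encode("utf-8")

# What a NUL-terminated C string would see: everything after 0x00 is dropped.
c_string_view = output_bytes.split(b"\x00", 1)[0]

# A binary encoding yields printable ASCII and reverses exactly.
output_text = base64.b64encode(output_bytes).decode("ascii")
recovered = base64.b64decode(output_text)

print(text_encoding_works)               # False
print(lossy_round_trip == output_bytes)  # False
print(c_string_view)                     # b'A'
print(recovered == output_bytes)         # True
```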