Text-to-text encryption


Introduction

Modern cryptography operates on bytes, not text, so the output of cryptographic algorithms is bytes. When encrypted data must be transferred through a text medium, those bytes must first be converted to text with a binary-safe encoding.

Parameters

Parameter   Details
TE          Text Encoding. The transformation from text to bytes. UTF-8 is a common choice.
BE          Binary Encoding. A transform that can process any arbitrary data and produce a valid string. Base64 is the most commonly used encoding, with Base16/hexadecimal a good runner-up. Wikipedia has a list of candidate encodings (stick to ones labelled "Arbitrary").
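
As a small illustration of the two roles, here is a sketch in Python. The language is an assumption, since the text above does not prescribe one, and the sample values are made up. It uses only `str.encode`, `bytes.hex`, and the standard `base64` module.

```python
import base64

# TE: text -> bytes. UTF-8 is assumed here, as suggested in the table.
text = "héllo"
text_bytes = text.encode("utf-8")                    # b'h\xc3\xa9llo'

# BE: arbitrary bytes -> text. These sample bytes are made up; they include
# 0x00, 0x08 and 0xFF, which a text encoding could not carry safely.
raw = bytes([0x00, 0x08, 0x7F, 0xFF])
as_base64 = base64.b64encode(raw).decode("ascii")    # 'AAh//w=='
as_hex = raw.hex()                                   # '00087fff'

# Both binary encodings are reversible for any input bytes.
assert base64.b64decode(as_base64) == raw
assert bytes.fromhex(as_hex) == raw
```

Base64 produces shorter output (4 characters per 3 bytes) than hexadecimal (2 characters per byte), which is one plausible reason it is the more common choice.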

Remarks

The general algorithm is:

Encrypt:

  • Transform InputText to InputBytes via encoding TE (text encoding).
  • Encrypt InputBytes to OutputBytes.
  • Transform OutputBytes to OutputText via BE (binary encoding).

Decrypt (reverse the BE and TE steps from Encrypt):

  • Transform InputText to InputBytes by decoding with BE.
  • Decrypt InputBytes to OutputBytes.
  • Transform OutputBytes to OutputText by decoding with TE.
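
The two procedures above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: TE is UTF-8, BE is Base64, and `toy_cipher` is a hypothetical placeholder that is NOT secure. It stands in for whatever real cipher is used, so that the example runs with the standard library alone. The names `encrypt_text` and `decrypt_text` are also made up for illustration.

```python
import base64
import hashlib


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Hypothetical stand-in cipher (NOT secure): XOR with a SHA-256 keystream.

    XOR is its own inverse, so the same function encrypts and decrypts.
    Replace this with a real, vetted cipher in practice.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


def encrypt_text(input_text: str, key: bytes) -> str:
    input_bytes = input_text.encode("utf-8")                # TE: text -> bytes
    output_bytes = toy_cipher(input_bytes, key)             # encrypt
    return base64.b64encode(output_bytes).decode("ascii")   # BE: bytes -> text


def decrypt_text(input_text: str, key: bytes) -> str:
    input_bytes = base64.b64decode(input_text)              # reverse BE
    output_bytes = toy_cipher(input_bytes, key)             # decrypt
    return output_bytes.decode("utf-8")                     # reverse TE


token = encrypt_text("héllo, wörld", b"demo key")
assert decrypt_text(token, b"demo key") == "héllo, wörld"
```

Note that the intermediate `token` contains only Base64 characters, so it can pass through a text medium unchanged, whatever bytes the cipher produced.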

The most common mistake is to choose a "text encoding" instead of a "binary encoding" for BE. This fails whenever any encrypted byte (or any IV byte) falls outside the range 0x20-0x7E (for UTF-8 or ASCII). Since that "safe range" covers only 95 of the 256 possible byte values, less than half of the byte space, the chance that every byte of a ciphertext lands inside it is vanishingly small. For example:

  • If the post-encryption string contains a 0x00 byte, C/C++ programs will likely misinterpret it as the end of the string.
  • If a console-based program sees 0x08, it may erase the previous character (and the control code itself), so the InputText passed to Decrypt has the wrong value (and the wrong length).
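
The failure can be reproduced with a short Python sketch. The "ciphertext" here is made up (every byte value once) rather than real cipher output, and UTF-8 is assumed as the mistakenly chosen text encoding.

```python
import base64

# Made-up stand-in for ciphertext: every possible byte value exactly once.
cipher_bytes = bytes(range(256))

# Mistake: using a text encoding (UTF-8) as BE.
try:
    cipher_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False        # invalid UTF-8 byte sequences are rejected

# Suppressing the error is worse: invalid bytes are silently replaced with
# U+FFFD, so the original bytes can no longer be recovered.
lossy = cipher_bytes.decode("utf-8", errors="replace")
round_trip_lossy = lossy.encode("utf-8") == cipher_bytes    # False

# A binary encoding round-trips every byte value.
b64 = base64.b64encode(cipher_bytes).decode("ascii")
round_trip_b64 = base64.b64decode(b64) == cipher_bytes      # True

# Size of the "safe range" 0x20-0x7E, and the chance that 16 independent,
# uniformly random bytes all fall inside it.
safe = 0x7E - 0x20 + 1       # 95
p16 = (safe / 256) ** 16     # on the order of 1e-7
```

Under the assumption that ciphertext bytes are uniformly distributed, even a single 16-byte block has only about a one-in-several-million chance of consisting entirely of "safe" bytes.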