C++ Conversion to std::wstring


Example

In C++, sequences of characters are represented by instantiating the std::basic_string class template with a character type. The two most commonly used specializations defined by the standard library are std::string and std::wstring:

  • std::string is built with elements of type char

  • std::wstring is built with elements of type wchar_t

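To convert between the two types, use std::wstring_convert together with a codecvt facet. The examples here use std::codecvt_utf8<wchar_t>, so the std::string is assumed to hold UTF-8 encoded text. Where wchar_t is 16 bits wide, this facet produces UCS-2 rather than UTF-16; std::codecvt_utf8_utf16<wchar_t> handles UTF-16. Note that std::wstring_convert and the <codecvt> header are deprecated since C++17: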

#include <string>
#include <codecvt>
#include <locale>

std::string input_str = "this is a -string-, which is a sequence based on the -char- type.";
std::wstring input_wstr = L"this is a -wide- string, which is based on the -wchar_t- type.";

// conversion
std::wstring str_turned_to_wstr = std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes(input_str);

std::string wstr_turned_to_str = std::wstring_convert<std::codecvt_utf8<wchar_t>>().to_bytes(input_wstr);
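
Because the facet used above is std::codecvt_utf8, the number of char elements and the number of wchar_t elements can differ for non-ASCII text. The following is a minimal sketch of that effect, not a definitive implementation; the helper names widen_utf8 and narrow_to_utf8 and the sample text are assumptions made for illustration:

```cpp
#include <codecvt>
#include <locale>
#include <string>

// "cafe" with an acute accent on the last letter, written out as UTF-8 bytes:
// U+00E9 is encoded as the two bytes 0xC3 0xA9.
const std::string utf8_text = "caf\xC3\xA9";

// Hypothetical helpers, named here for illustration only.
std::wstring widen_utf8(const std::string& utf8)
{
    return std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes(utf8);
}

std::string narrow_to_utf8(const std::wstring& wide)
{
    return std::wstring_convert<std::codecvt_utf8<wchar_t>>().to_bytes(wide);
}
```

Here utf8_text holds five char elements, while widen_utf8(utf8_text) holds four wchar_t elements, the last one being U+00E9. Converting back with narrow_to_utf8 yields the original five bytes.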

To improve readability, you can define functions that perform the conversion:

#include <string>
#include <codecvt>
#include <locale>

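using convert_t = std::codecvt_utf8<wchar_t>;

// Shared converter: it holds conversion state, so do not use it from several threads at once.
std::wstring_convert<convert_t, wchar_t> strconverter;

// Wide string -> UTF-8 encoded std::string
std::string to_string(const std::wstring& wstr)
{
    return strconverter.to_bytes(wstr);
}

// UTF-8 encoded std::string -> wide string
std::wstring to_wstring(const std::string& str)
{
    return strconverter.from_bytes(str);
}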

Sample usage:

std::wstring a_wide_string = to_wstring("Hello World!");

That's certainly more readable than std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes("Hello World!").
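
A conversion can fail: when std::wstring_convert is constructed without error strings, from_bytes and to_bytes throw std::range_error on input they cannot convert. The sketch below shows one possible way to wrap that, assuming the input is meant to be UTF-8. The names try_utf8_to_wide and try_wide_to_utf8 are hypothetical, and the converter is created per call so that no conversion state is shared between callers:

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true and fills `out` on success,
// false if `utf8` could not be converted (for example, invalid UTF-8).
bool try_utf8_to_wide(const std::string& utf8, std::wstring& out)
{
    // Local converter: nothing is shared between calls or threads.
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    try {
        out = converter.from_bytes(utf8);
        return true;
    } catch (const std::range_error&) {
        return false;
    }
}

// Hypothetical helper: the same idea in the opposite direction.
bool try_wide_to_utf8(const std::wstring& wide, std::string& out)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    try {
        out = converter.to_bytes(wide);
        return true;
    } catch (const std::range_error&) {
        return false;
    }
}
```

Returning a bool is only one design choice; letting the exception propagate, or passing fallback strings to the std::wstring_convert constructor, are alternatives.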


Please note that neither char nor wchar_t implies an encoding, and the size of wchar_t is implementation-defined. For instance, wchar_t is commonly a 2-byte type holding UTF-16 data on Windows (UCS-2 in versions prior to Windows 2000) and a 4-byte type holding UTF-32 data on Linux. This is in contrast with the newer types char16_t and char32_t, introduced in C++11, which are guaranteed to be large enough to hold any UTF-16 or UTF-32 code unit, respectively. A char32_t can therefore hold any code point, whereas a code point outside the Basic Multilingual Plane needs two char16_t code units (a surrogate pair).
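
The difference in element width becomes visible with a character outside the Basic Multilingual Plane. The following sketch assumes the usual wide encodings mentioned above (UTF-16 for a 2-byte wchar_t, UTF-32 for a 4-byte wchar_t); the variable names and the function expected_wide_units are made up for illustration:

```cpp
#include <cstddef>
#include <string>

// U+1F600 lies outside the Basic Multilingual Plane.
const std::u16string emoji_utf16 = u"\U0001F600"; // two UTF-16 code units (a surrogate pair)
const std::u32string emoji_utf32 = U"\U0001F600"; // one UTF-32 code unit
const std::wstring   emoji_wide  = L"\U0001F600"; // depends on the width of wchar_t

// Hypothetical helper: number of wchar_t elements expected for one
// non-BMP code point, assuming UTF-16 when wchar_t is 2 bytes and UTF-32 otherwise.
constexpr std::size_t expected_wide_units()
{
    return sizeof(wchar_t) == 2 ? 2 : 1;
}
```

The sizes of emoji_utf16 and emoji_utf32 are the same on every platform, while emoji_wide.size() is expected to match expected_wide_units() only under the stated assumption.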