#include <iostream>

int x, y;
bool ready = false;   // plain bool: accesses to it are not synchronized

void init()
{
    x = 2;
    y = 3;
    ready = true;
}

void use()
{
    if (ready)
        std::cout << x + y;
}
One thread calls the init() function while another thread (or signal handler) calls the use() function. One might expect that the use() function will either print 5 or do nothing. This may not always be the case for several reasons:
The CPU may reorder the writes that happen in init() so that the code that actually executes might look like:
void init()
{
    ready = true;
    x = 2;
    y = 3;
}
The CPU may reorder the reads that happen in use() so that the code that actually executes might become:
void use()
{
    int local_x = x;
    int local_y = y;
    if (ready)
        std::cout << local_x + local_y;
}
An optimizing C++ compiler may decide to reorder the program in a similar way.
Such reordering cannot change the behavior of a program running in a single thread, because a single thread cannot interleave the calls to init() and use(). In a multi-threaded setting, on the other hand, one thread may observe only some of the writes performed by the other thread, so use() may see ready==true and garbage in x or y or both.
The C++ Memory Model allows the programmer to specify which reorderings are permitted and which are not, so that a multi-threaded program can also behave as expected. The example above can be rewritten in a thread-safe way like this:
#include <atomic>
#include <iostream>

int x, y;
std::atomic<bool> ready{false};

void init()
{
    x = 2;
    y = 3;
    ready.store(true, std::memory_order_release);
}

void use()
{
    if (ready.load(std::memory_order_acquire))
        std::cout << x + y;
}
Here init() performs an atomic store-release operation. This not only stores the value true into ready, but also tells the compiler that writes sequenced before this operation may not be moved after it.
The use() function performs an atomic load-acquire operation. It reads the current value of ready and also forbids the compiler from moving reads that are sequenced after it to before the atomic load-acquire.
These atomic operations also cause the compiler to emit whatever hardware instructions are needed to keep the CPU from performing the unwanted reorderings.
Because the atomic store-release is to the same memory location as the atomic load-acquire, the memory model stipulates that if the load-acquire operation sees the value written by the store-release operation, then all writes performed by init()'s thread prior to that store-release will be visible to loads that use()'s thread executes after its load-acquire. That is, if use() sees ready==true, then it is guaranteed to see x==2 and y==3.
Note that the compiler and the CPU are still allowed to write to y before writing to x, and similarly the reads from these variables in use() can happen in any order.