Different threads accessing the same memory location participate in a data race if at least one of the accesses is a modification (also known as a store operation), at least one of them is not atomic, and nothing orders the accesses with respect to each other. Such data races cause undefined behavior. To avoid them, one needs to prevent the threads from concurrently executing the conflicting operations.
Synchronization primitives (mutex, critical section and the like) can guard such accesses. The memory model introduced in C++11 defines two new portable ways to synchronize access to memory in a multi-threaded environment: atomic operations and fences.
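As an illustration of guarding conflicting accesses with a synchronization primitive, the following is a minimal sketch using std::mutex. The function name guarded_count and its parameters are hypothetical and chosen only for this example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: increments a shared counter from several threads,
// guarding every access with a std::mutex so that no data race occurs.
int guarded_count(int num_threads, int increments_per_thread) {
    int counter = 0;           // plain (non-atomic) shared variable
    std::mutex counter_mutex;  // protects `counter`

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Only one thread at a time can hold the lock, so the
                // read-modify-write of `counter` never overlaps.
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    // join() synchronizes with the end of each worker, so the final
    // value is visible to the calling thread.
    return counter;
}
```

Without the lock, the concurrent `++counter` operations would form a data race and the behavior would be undefined.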
It is now possible to read and write a given memory location using atomic load and atomic store operations. For convenience these are wrapped in the std::atomic<T> template class. This class wraps a value of type T, but loads and stores to the object are atomic.
The template is not available for all types: T must be trivially copyable. Whether the operations are lock-free is implementation specific, but this is usually the case for most (or all) integral types as well as pointer types. So std::atomic<std::vector<foo> *> should be available, while std::atomic<std::pair<bool,char>> most probably won't be, because std::pair is commonly not trivially copyable.
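The basic usage can be sketched as follows. The struct Point and the function names are hypothetical; the sketch assumes the standard requirement that a type wrapped in std::atomic is trivially copyable, and it only queries (rather than assumes) whether operations are lock-free.

```cpp
#include <atomic>
#include <type_traits>

// Hypothetical example type: a small, trivially copyable struct.
struct Point {
    short x;
    short y;
};

// A type wrapped in std::atomic must be trivially copyable.
static_assert(std::is_trivially_copyable<Point>::value,
              "Point can be wrapped in std::atomic");

// Stores a value and reads it back using atomic operations.
int atomic_round_trip(int value) {
    std::atomic<int> a{0};
    a.store(value);   // atomic store
    return a.load();  // atomic load
}

// Same round trip for the user-defined type; returns x + y of the copy.
int atomic_point_sum(short x, short y) {
    std::atomic<Point> p{Point{0, 0}};
    p.store(Point{x, y});
    Point copy = p.load();
    return copy.x + copy.y;
}

// Whether the operations are lock-free is implementation specific,
// so it is queried at run time instead of being assumed.
bool int_is_lock_free() {
    std::atomic<int> a{0};
    return a.is_lock_free();
}
```

The result of int_is_lock_free() depends on the platform, which is why the sketch does not state an expected value for it.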
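Atomic operations take an optional std::memory_order parameter, which defines what additional properties the operation has regarding other memory locations:

| std::memory_order tag | Meaning |
|---|---|
| std::memory_order_relaxed | no additional restrictions |
| std::memory_order_release, std::memory_order_acquire | if a load-acquire sees the value written by a store-release, stores sequenced before the store-release happen before loads sequenced after the load-acquire |
| std::memory_order_consume | like std::memory_order_acquire, but only for loads that depend on the loaded value |
| std::memory_order_acq_rel | combines load-acquire and store-release |
| std::memory_order_seq_cst | sequential consistency |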
These memory order tags allow three different memory ordering disciplines: sequential consistency, relaxed, and release-acquire with its sibling release-consume.
If no memory order is specified for an atomic operation, the order defaults to sequential consistency. This mode can also be explicitly selected by tagging the operation with std::memory_order_seq_cst.
With this order no memory operation can cross the atomic operation. All memory operations sequenced before the atomic operation happen before the atomic operation, and the atomic operation happens before all memory operations that are sequenced after it. In addition, all sequentially consistent operations appear in a single total order that every thread agrees on. This mode is probably the easiest one to reason about, but it also carries the greatest performance penalty. It also prevents all compiler optimizations that might otherwise try to reorder operations past the atomic operation.
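The effect of the single total order can be illustrated with the classic "store buffering" test, sketched below. The function name count_both_zero is hypothetical. Each thread stores to one variable and then loads the other; under sequential consistency at least one thread must observe the other thread's store, so both loads returning 0 is not a permitted outcome.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffering test `rounds` times and returns how often
// both threads read 0 (which sequential consistency rules out).
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        std::thread a([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread b([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        a.join();
        b.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

With weaker orderings on the same operations, the outcome r1 == 0 and r2 == 0 would be allowed, which is the price of the cheaper modes.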
The opposite of sequential consistency is relaxed memory ordering. It is selected with the std::memory_order_relaxed tag. A relaxed atomic operation imposes no restrictions on other memory operations. The only effect that remains is that the operation itself is still atomic.
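A typical case where atomicity alone is enough is a shared counter. The following is a minimal sketch; the function name relaxed_count and its parameters are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: counts increments from several threads using
// relaxed read-modify-write operations. No other memory is published
// through the counter, so no ordering is required.
int relaxed_count(int num_threads, int increments_per_thread) {
    std::atomic<int> counter{0};

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic, so no increment is lost, but unordered with
                // respect to other memory operations.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load(std::memory_order_relaxed);
}
```

The final value is exact because each fetch_add is atomic; the join() calls, not the relaxed operations, make the result visible to the caller.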
An atomic store operation can be tagged with std::memory_order_release and an atomic load operation can be tagged with std::memory_order_acquire. The first operation is called (atomic) store-release while the second is called (atomic) load-acquire.
When a load-acquire sees the value written by a store-release, the following happens: all store operations sequenced before the store-release become visible to (happen before) load operations that are sequenced after the load-acquire.
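This pairing is commonly used to hand plain data from one thread to another. The following is a minimal sketch; the function pass_message and the variable names are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: passes `value` from a producer thread to a
// consumer thread through a plain int, published by a release store
// and picked up by an acquire load.
int pass_message(int value) {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // publication flag
    int received = -1;

    std::thread producer([&] {
        payload = value;                               // sequenced before the store-release
        ready.store(true, std::memory_order_release);  // store-release
    });
    std::thread consumer([&] {
        // Spin until the load-acquire sees the value written above.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = payload;  // sequenced after the load-acquire: sees `value`
    });

    producer.join();
    consumer.join();
    return received;
}
```

Because the consumer reads payload only after its load-acquire has observed true, the write to payload happens before the read and there is no data race.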
Atomic read-modify-write operations can also receive the cumulative tag std::memory_order_acq_rel. This makes the atomic load portion of the operation an atomic load-acquire while the atomic store portion becomes an atomic store-release.
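One way to use this is an arrival counter, sketched below under the assumption that each worker writes only its own slot. The function name sum_by_last_arrival is hypothetical. Every worker publishes its slot with the release half of fetch_add, and the worker that arrives last observes all earlier slots through the acquire half.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: each worker writes its own slot and then bumps
// an arrival counter with an acq_rel read-modify-write. The worker that
// arrives last sums all slots.
int sum_by_last_arrival(int num_threads) {
    std::vector<int> slots(num_threads, 0);  // one plain int per worker
    std::atomic<int> arrived{0};
    int total = -1;  // written only by the last worker to arrive

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            slots[t] = t + 1;  // plain store, published by the release half
            int before = arrived.fetch_add(1, std::memory_order_acq_rel);
            if (before == num_threads - 1) {
                // The acquire half makes every earlier worker's slot visible.
                int sum = 0;
                for (int v : slots) {
                    sum += v;
                }
                total = sum;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return total;
}
```

With four workers the slots hold 1, 2, 3 and 4, so the last worker computes 10. A relaxed fetch_add would still count correctly but would not make the other workers' slots safely readable.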
The compiler is not allowed to move store operations after an atomic store-release operation. It is also not allowed to move load operations before an atomic load-acquire (or, for loads that depend on the loaded value, before an atomic load-consume).
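Also note that there is no atomic load-release or atomic store-acquire. Passing std::memory_order_release or std::memory_order_acq_rel to a load, or std::memory_order_acquire, std::memory_order_consume or std::memory_order_acq_rel to a store, violates the requirements of the operation and results in undefined behavior.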
Release-consume ordering is similar to release-acquire, but this time the atomic load is tagged with std::memory_order_consume and becomes an (atomic) load-consume operation. This mode is the same as release-acquire, with the only difference that among the load operations sequenced after the load-consume only those depending on the value loaded by the load-consume are ordered.
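The typical use is publishing a pointer and then reading through it. The following is a minimal sketch; the struct Config and the function publish_and_read are hypothetical. Note that many current implementations simply treat std::memory_order_consume as std::memory_order_acquire, and newer revisions of the standard discourage its use, so this is shown for illustration only.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example type published through a pointer.
struct Config {
    int value;
};

// Hypothetical helper: the producer fills in a Config and publishes its
// address; the consumer loads the pointer with consume ordering and
// reads through it.
int publish_and_read(int value) {
    Config config{0};
    std::atomic<Config*> published{nullptr};
    int seen = -1;

    std::thread producer([&] {
        config.value = value;                                 // before the store-release
        published.store(&config, std::memory_order_release);  // store-release
    });
    std::thread consumer([&] {
        Config* p = nullptr;
        // Spin until the load-consume sees the published pointer.
        while ((p = published.load(std::memory_order_consume)) == nullptr) {
            std::this_thread::yield();
        }
        // This load depends on the loaded pointer, so it is ordered
        // after the producer's write to config.value.
        seen = p->value;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Only the access through p is ordered here; a load of some unrelated variable after the load-consume would get no such guarantee.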
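Fences also allow memory operations to be ordered between threads. A fence is created with std::atomic_thread_fence and is a release fence, an acquire fence, or both (std::memory_order_acq_rel and std::memory_order_seq_cst).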
If a release fence happens before an acquire fence, then stores sequenced before the release fence are visible to loads sequenced after the acquire fence. To guarantee that the release fence happens before the acquire fence one may use other synchronization primitives, including relaxed atomic operations: for example, a relaxed atomic store sequenced after the release fence whose value is read by a relaxed atomic load sequenced before the acquire fence.
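The fence-based variant of message passing can be sketched as follows. The function pass_with_fences and the variable names are hypothetical. The atomic flag is accessed with relaxed operations only; the ordering comes entirely from the two fences.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: passes `value` between threads using relaxed
// atomic operations on a flag plus a release fence and an acquire fence.
int pass_with_fences(int value) {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};
    int received = -1;

    std::thread producer([&] {
        payload = value;
        // Release fence: orders the store to `payload` before the
        // relaxed store that follows.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: orders the preceding relaxed load before the
        // read of `payload` that follows.
        std::atomic_thread_fence(std::memory_order_acquire);
        received = payload;
    });

    producer.join();
    consumer.join();
    return received;
}
```

The relaxed store after the release fence and the relaxed load before the acquire fence are what establish that the release fence happens before the acquire fence.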