A data race is a problem that can occur when a multithreaded program is not properly synchronized. If two or more threads access the same memory location without synchronization, and at least one of the accesses is a write, a data race occurs. The resulting behavior is platform-dependent and possibly inconsistent: for example, the result of a calculation could depend on the thread scheduling. In the following pseudocode, the writer and the reader access the buffer without any synchronization:
writer_thread {
    write_to(buffer)
}
reader_thread {
    read_from(buffer)
}
A simple solution:
writer_thread {
    lock(buffer)
    write_to(buffer)
    unlock(buffer)
}
reader_thread {
    lock(buffer)
    read_from(buffer)
    unlock(buffer)
}
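In Java, the same idea could be sketched with a `synchronized` block. This is only a minimal illustration: the class `LockedBuffer`, its methods, and the helper `fill` are hypothetical names that do not appear in the pseudocode, and an `ArrayList` stands in for the buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared buffer: every access goes through the same lock object.
class LockedBuffer {
    private final Object lock = new Object();
    private final List<Integer> buffer = new ArrayList<>();

    void write(int value) {
        synchronized (lock) {       // lock(buffer)
            buffer.add(value);      // write_to(buffer)
        }                           // unlock(buffer)
    }

    int size() {
        synchronized (lock) {       // readers take the same lock as writers
            return buffer.size();
        }
    }

    // Starts `writers` threads that each append `perWriter` values,
    // waits for all of them, and returns the final buffer size.
    static int fill(int writers, int perWriter) {
        LockedBuffer shared = new LockedBuffer();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < writers; w++) {
            Thread t = new Thread(() -> {
                for (int i = 0; i < perWriter; i++) {
                    shared.write(i);
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 10_000)); // prints 40000
    }
}
```

Because all accesses are serialized by one lock, no write is lost; the price is that readers also exclude each other.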
This simple solution works well if there is only one reader thread, but if there is more than one, it slows down the execution unnecessarily, because the reader threads could read simultaneously.
A solution that avoids this problem could be:
writer_thread {
    lock(reader_count)
    if(reader_count == 0) {
        write_to(buffer)
    }
    unlock(reader_count)
}
reader_thread {
    lock(reader_count)
    reader_count = reader_count + 1
    unlock(reader_count)
    read_from(buffer)
    lock(reader_count)
    reader_count = reader_count - 1
    unlock(reader_count)
}
Note that reader_count is locked throughout the whole write operation, so that no reader can begin reading before the write has finished.
Now many readers can read simultaneously, but a new problem may arise: reader_count may never reach 0, so that the writer thread can never write to the buffer. This is called starvation, and there are different solutions to avoid it.
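One possible way to get shared reads and exclusive writes in Java is `java.util.concurrent.locks.ReentrantReadWriteLock`. The sketch below is an assumption-laden illustration, not the only solution: the class `ReadWriteBuffer` and the helper `demo` are hypothetical, and the lock is constructed in fair mode, which serves waiting threads in approximately arrival order and so reduces the risk of the writer starving.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical buffer guarded by a read-write lock.
class ReadWriteBuffer {
    // `true` requests the fair policy: a waiting writer is not bypassed
    // indefinitely by newly arriving readers.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private int value;

    void write(int newValue) {
        lock.writeLock().lock();    // exclusive: no readers, no other writers
        try {
            value = newValue;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int read() {
        lock.readLock().lock();     // shared: many readers may hold it at once
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // One writer stores 1..lastValue while `readers` threads read concurrently.
    // Returns the value seen after all threads have finished.
    static int demo(int readers, int lastValue) {
        ReadWriteBuffer shared = new ReadWriteBuffer();
        Thread[] threads = new Thread[readers + 1];
        threads[0] = new Thread(() -> {
            for (int v = 1; v <= lastValue; v++) {
                shared.write(v);
            }
        });
        for (int r = 1; r <= readers; r++) {
            threads[r] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    shared.read();
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.read();
    }

    public static void main(String[] args) {
        System.out.println(demo(3, 100)); // prints 100
    }
}
```

Fair mode usually costs some throughput compared with the default non-fair mode, so whether it is worth it depends on how often writers actually have to wait.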
Even programs that may seem correct can be problematic:
boolean_variable = false
writer_thread {
    boolean_variable = true
}
reader_thread {
    while_not(boolean_variable)
    {
        do_something()
    }
}
The example program might never terminate, since the reader thread might never see the update from the writer thread. If, for example, the hardware uses CPU caches, the value might be cached. Since a write to or a read from a normal field does not force a refresh of the cache, the changed value might never be seen by the reading thread. A compiler may likewise keep the value in a register or move the read out of the loop.
C++ and Java each define, in their so-called memory models, what "properly synchronized" means: see the C++ Memory Model and the Java Memory Model.
In Java a solution would be to declare the field as volatile:
volatile boolean boolean_field;
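A minimal sketch of how the flag example might look as a complete Java program follows. The class `VolatileFlagDemo`, the field name `ready`, and the five-second timeout are illustrative assumptions, not part of the original example.

```java
// Hypothetical demo: a reader spins on a volatile flag set by another thread.
class VolatileFlagDemo {
    // `volatile` guarantees that the write below becomes visible to the reader.
    // Without it, the loop would not be guaranteed to terminate.
    private volatile boolean ready = false;

    // Returns true if the reader thread saw the update and terminated
    // within the (arbitrarily chosen) timeout.
    static boolean run() {
        VolatileFlagDemo demo = new VolatileFlagDemo();
        Thread reader = new Thread(() -> {
            while (!demo.ready) {
                Thread.yield();     // stands in for do_something()
            }
        });
        reader.start();

        demo.ready = true;          // the writer's update

        try {
            reader.join(5_000);     // wait at most five seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !reader.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Here the calling thread plays the role of the writer; a separate writer thread would work the same way.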
In C++ a solution would be to declare the field as atomic:
std::atomic<bool> data_ready(false);
A data race is a kind of race condition, but not all race conditions are data races. The following method, when called by more than one thread, leads to a race condition but not to a data race:
class Counter {
    private volatile int count = 0;
    public void addOne() {
        count++;
    }
}
It is correctly synchronized according to the Java Memory Model specification, therefore it is not a data race. It still leads to a race condition, however: the increment is a read followed by a write rather than a single atomic step, so the result depends on the interleaving of the threads.
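One way to remove this race condition, sketched here under the assumption that only the counter itself needs protecting, is to make the increment a single atomic operation with `java.util.concurrent.atomic.AtomicInteger`. The class name `AtomicCounter` and the helper `countWith` are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical replacement for the counter: the increment is one atomic step.
class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    public void addOne() {
        count.incrementAndGet();    // atomic read-modify-write
    }

    public int get() {
        return count.get();
    }

    // Runs `threads` threads that each call addOne() `perThread` times
    // and returns the final count.
    static int countWith(int threads, int perThread) {
        AtomicCounter counter = new AtomicCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.addOne();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000)); // prints 40000
    }
}
```

Declaring `addOne()` as `synchronized` would be another option; the atomic class simply avoids holding a lock for a single increment.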
Not all data races are bugs. An example of a so-called benign data race is sun.reflect.NativeMethodAccessorImpl:
class NativeMethodAccessorImpl extends MethodAccessorImpl {
    private Method method;
    private DelegatingMethodAccessorImpl parent;
    private int numInvocations;
    NativeMethodAccessorImpl(Method method) {
        this.method = method;
    }
    public Object invoke(Object obj, Object[] args)
        throws IllegalArgumentException, InvocationTargetException
    {
        if (++numInvocations > ReflectionFactory.inflationThreshold()) {
            MethodAccessorImpl acc = (MethodAccessorImpl)
                new MethodAccessorGenerator().
                    generateMethod(method.getDeclaringClass(),
                                   method.getName(),
                                   method.getParameterTypes(),
                                   method.getReturnType(),
                                   method.getExceptionTypes(),
                                   method.getModifiers());
            parent.setDelegate(acc);
        }
        return invoke0(method, obj, args);
    }
    ...
}
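Here the performance of the code is more important than the correctness of the count in numInvocations: the field is only compared against a threshold, so an occasional lost increment merely delays the switch to the generated accessor.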