Consider this example:
public class ThreadTest implements Runnable {
    private boolean stop = false;

    public void run() {
        long counter = 0;
        while (!stop) {
            counter = counter + 1;
        }
        System.out.println("Counted " + counter);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadTest tt = new ThreadTest();
        new Thread(tt).start();   // Create and start child thread
        Thread.sleep(1000);
        tt.stop = true;           // Tell child thread to stop.
    }
}
The intent of this program is to start a thread, let it run for 1000 milliseconds, and then cause it to stop by setting the stop flag.
Will it work as intended? Maybe yes, maybe no.
An application does not necessarily stop when the main method returns. If another thread has been created, and that thread has not been marked as a daemon thread, then the application will continue to run after the main thread has ended. In this example, that means that the application will keep running until the child thread ends. That should happen when tt.stop is set to true.
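As an aside, the JVM only waits for non-daemon threads. Here is a minimal sketch (with an illustrative class name) of a thread that would not keep the application alive after main returns:

public class DaemonTest implements Runnable {
    public void run() {
        while (true) {
            // Loop forever. Because this thread is a daemon, it does not
            // prevent the JVM from exiting.
        }
    }

    public static void main(String[] args) {
        Thread t = new Thread(new DaemonTest());
        t.setDaemon(true);   // Must be called before start()
        t.start();
        // When main returns, only the daemon thread remains, so the JVM exits.
    }
}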
Returning to the example: the claim that the application ends once tt.stop is set to true is not strictly accurate. In fact, the child thread will stop only after it has observed stop with the value true. Will that happen? Maybe yes, maybe no.
The Java Language Specification guarantees that memory reads and writes made in a thread are visible to that thread, as per the order of the statements in the source code. However, in general, this is NOT guaranteed when one thread writes and another thread (subsequently) reads. To get guaranteed visibility, there needs to be a chain of happens-before relations between a write and a subsequent read. In the example above, there is no such chain for the update to the stop flag, and therefore it is not guaranteed that the child thread will see stop change to true.
In this case, there are two simple ways to ensure that the stop update is visible. The first is to declare stop to be volatile:

private volatile boolean stop = false;

For a volatile variable, the JLS specifies that there is a happens-before relation between a write by one thread and a later read by a second thread.
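Putting the pieces together, here is a sketch of the original example with that one-line fix applied (the only other change is declaring that main may be interrupted while sleeping):

public class ThreadTest implements Runnable {
    private volatile boolean stop = false;   // volatile provides the happens-before guarantee

    public void run() {
        long counter = 0;
        while (!stop) {
            counter = counter + 1;
        }
        System.out.println("Counted " + counter);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadTest tt = new ThreadTest();
        new Thread(tt).start();   // Create and start child thread
        Thread.sleep(1000);
        tt.stop = true;           // This write is now guaranteed to become visible to the child thread
    }
}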
The second way is to use a mutex to synchronize, as follows:
public class ThreadTest implements Runnable {
    private boolean stop = false;

    public void run() {
        long counter = 0;
        while (true) {
            synchronized (this) {
                if (stop) {
                    break;
                }
            }
            counter = counter + 1;
        }
        System.out.println("Counted " + counter);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadTest tt = new ThreadTest();
        new Thread(tt).start();   // Create and start child thread
        Thread.sleep(1000);
        synchronized (tt) {
            tt.stop = true;       // Tell child thread to stop.
        }
    }
}
In addition to ensuring that there is mutual exclusion, the JLS specifies that there is a happens-before relation between the release of a mutex in one thread and the subsequent acquisition of the same mutex by a second thread.
But isn't assignment to a variable atomic? Yes it is! However, that fact does not mean that the effects of the update will be visible simultaneously to all threads. Only a proper chain of happens-before relations will guarantee that.
Programmers doing multi-threaded programming in Java for the first time often find the Memory Model challenging. Programs behave in unintuitive ways because the natural expectation is that writes are visible uniformly. So why did the Java designers design the Memory Model this way?
It actually comes down to a compromise between performance and ease of use (for the programmer).
A modern computer architecture consists of multiple processors (cores), each with its own set of registers. Main memory is accessible either to all processors or to groups of processors. Another property of modern computer hardware is that access to registers is typically orders of magnitude faster than access to main memory. As the number of cores scales up, it is easy to see that reading and writing to main memory can become a system's main performance bottleneck.
This mismatch is addressed by implementing one or more levels of memory caching between the processor cores and main memory. Each core accesses memory cells via its cache. Normally, a main memory read only happens when there is a cache miss, and a main memory write only happens when a cache line needs to be flushed. For an application where each core's working set of memory locations fits into its cache, the core's speed is no longer limited by main memory speed or bandwidth.
But that gives us a new problem when multiple cores are reading and writing shared variables. The latest version of a variable may sit in one core's cache. Unless that core flushes the cache line to main memory, AND the other cores invalidate their cached copies of older versions, some of them are liable to see stale versions of the variable. But if the caches were flushed to memory each time there was a cache write ("just in case" there was a read by another core), that would consume main memory bandwidth unnecessarily.
The standard solution used at the hardware instruction set level is to provide instructions for cache invalidation and cache write-through, and to leave it to the compiler to decide when to use them.
Returning to Java: the Memory Model is designed so that Java compilers are not required to issue cache invalidation and write-through instructions where they are not really needed. The assumption is that the programmer will use an appropriate synchronization mechanism (e.g. primitive mutexes, volatile, higher-level concurrency classes, and so on) to indicate when memory visibility is needed. In the absence of a happens-before relation, Java compilers are free to assume that no cache operations (or similar) are required.
This has significant performance advantages for multi-threaded applications, but the downside is that writing correct multi-threaded applications is not a simple matter. The programmer does have to understand what he or she is doing.
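As an illustration of the higher-level classes mentioned above, here is a sketch (with an illustrative class name) in which the hand-rolled flag is replaced by a java.util.concurrent.atomic.AtomicBoolean; its get and set methods are specified to have the memory effects of volatile reads and writes, so the necessary happens-before relation is present:

import java.util.concurrent.atomic.AtomicBoolean;

public class AtomicThreadTest implements Runnable {
    private final AtomicBoolean stop = new AtomicBoolean(false);

    public void run() {
        long counter = 0;
        while (!stop.get()) {        // get() behaves like a volatile read
            counter = counter + 1;
        }
        System.out.println("Counted " + counter);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicThreadTest tt = new AtomicThreadTest();
        new Thread(tt).start();      // Create and start child thread
        Thread.sleep(1000);
        tt.stop.set(true);           // set() behaves like a volatile write
    }
}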
There are a number of reasons why problems like this are difficult to reproduce:
As explained above, the consequence of not dealing with memory visibility issues properly is typically that your compiled application does not handle the memory caches correctly. However, as we alluded to above, memory caches often get flushed anyway.
When you change the hardware platform, the characteristics of the memory caches may change. This can lead to different behavior if your application does not synchronize correctly.
You may be observing the effects of serendipitous synchronization. For example, if you add traceprints, there is typically some synchronization happening behind the scenes in the I/O streams that causes cache flushes. So adding traceprints often causes the application to behave differently (see the sketch below).
Running an application under a debugger causes it to be compiled differently by the JIT compiler. Breakpoints and single stepping exacerbate this. These effects will often change the way an application behaves.
These things make bugs that are due to inadequate synchronization particularly difficult to solve.
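To illustrate the traceprint point: the following variation of the original (broken) example will often appear to stop as intended, simply because System.out.println typically performs internal locking and I/O, which tends to flush caches as a side effect. It is still incorrectly synchronized; the bug has merely been masked:

public class ThreadTest implements Runnable {
    private boolean stop = false;   // still not volatile and not synchronized

    public void run() {
        long counter = 0;
        while (!stop) {
            counter = counter + 1;
            // Trace print added while debugging. The I/O stream typically locks
            // internally, which often has the side effect of making the update
            // to 'stop' visible to this thread, but nothing in the JLS guarantees it.
            System.out.println("counter = " + counter);
        }
        System.out.println("Counted " + counter);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadTest tt = new ThreadTest();
        new Thread(tt).start();   // Create and start child thread
        Thread.sleep(1000);
        tt.stop = true;           // Still no happens-before relation with the read in run()
    }
}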