Consider the following example:
public class Example {
    public int a, b, c, d;

    public void doIt() {
        a = b + 1;
        c = d + 1;
    }
}
If this class is used in a single-threaded application, then the observable behavior will be exactly as you would expect. For instance:
public class SingleThreaded {
    public static void main(String[] args) {
        Example eg = new Example();
        System.out.println(eg.a + ", " + eg.c);
        eg.doIt();
        System.out.println(eg.a + ", " + eg.c);
    }
}

will output:

0, 0
1, 1
As far as the "main" thread can tell, the statements in the main() method and the doIt() method will be executed in the order that they are written in the source code. This is a clear requirement of the Java Language Specification (JLS).
Now consider the same class used in a multi-threaded application.
public class MultiThreaded {
    public static void main(String[] args) {
        final Example eg = new Example();
        new Thread(new Runnable() {
            public void run() {
                while (true) {
                    eg.doIt();
                }
            }
        }).start();
        while (true) {
            System.out.println(eg.a + ", " + eg.c);
        }
    }
}
What will this print?
In fact, according to the JLS it is not possible to predict that this will print:

- 0, 0 to start with.
- N, N or N, N + 1 after that.

Indeed, output such as N + 1, N is possible, and it is even possible that the 0, 0 lines continue forever 1.

1 - In practice the presence of the println statements is liable to cause some serendipitous synchronization and memory cache flushing. That is likely to hide some of the effects that would cause the above behavior.
So how can we explain these behaviors?

One possible explanation for unexpected results is that the JIT compiler has changed the order of the assignments in the doIt() method. The JLS requires that statements appear to execute in order from the perspective of the current thread. In this case, nothing in the code of the doIt() method can observe the effect of a (hypothetical) reordering of those two statements. This means that the JIT compiler is permitted to reorder them.
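To illustrate, here is a hypothetical hand-reordered version of the class (the name ReorderedExample is ours, not from the JLS). From the perspective of the thread calling doIt(), it is indistinguishable from the original:

```java
public class ReorderedExample {
    public int a, b, c, d;

    // The same two assignments as Example.doIt(), but in the opposite
    // order. Neither statement reads a variable the other writes, so no
    // code running in this thread can tell the difference - which is why
    // the JIT compiler is free to emit the stores in either order.
    public void doIt() {
        c = d + 1;   // originally second
        a = b + 1;   // originally first
    }

    public static void main(String[] args) {
        ReorderedExample eg = new ReorderedExample();
        eg.doIt();
        System.out.println(eg.a + ", " + eg.c); // prints 1, 1 either way
    }
}
```

A single-threaded caller sees a == 1 and c == 1 after doIt() returns in either version; only another thread peeking at the fields mid-call could distinguish them.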
Why would it do that?
On typical modern hardware, machine instructions are executed using an instruction pipeline, which allows a sequence of instructions to be at different stages of execution at the same time. Some phases of instruction execution take longer than others, and memory operations tend to take longer still. A smart compiler can optimize the instruction throughput of the pipeline by ordering the instructions to maximize the amount of overlap. This may lead to parts of statements being executed out of order. The JLS permits this, provided that it does not affect the result of the computation from the perspective of the current thread.
A second possible explanation is the effect of memory caching. In a classical computer architecture, each processor has a small set of registers and a larger amount of main memory. Access to registers is much faster than access to main memory. In modern architectures, there are also memory caches that are slower than registers, but faster than main memory.
A compiler will exploit this by trying to keep copies of variables in registers, or in the memory caches. If a variable does not need to be flushed to main memory, or does not need to be read from main memory, there are significant performance benefits in not doing so. In cases where the JLS does not require memory operations to be visible to another thread, the Java JIT compiler is likely not to add the "read barrier" and "write barrier" instructions that would force main memory reads and writes. Once again, the performance benefits of doing this are significant.
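A sketch of the visibility problem this causes (class and method names are illustrative): if the flag below were a plain boolean, the reader thread's loop could legally spin forever, because the JIT may cache the flag in a register. Declaring it volatile forces a fresh main-memory read on each iteration:

```java
public class VisibilityDemo {
    // If this were a plain (non-volatile) field, the reader's loop might
    // never see the update. 'volatile' forces each read to see the
    // latest write.
    static volatile boolean done;

    // Returns true once the reader thread has observed done == true.
    static boolean demo() {
        done = false;
        final boolean[] observed = { false };
        Thread reader = new Thread(() -> {
            while (!done) {
                // spin until the volatile write becomes visible
            }
            observed[0] = true;
        });
        reader.start();
        done = true;       // volatile write: visible to the reader's next read
        try {
            reader.join(); // join also orders our read of observed[0]
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw update: " + demo());
    }
}
```

With the volatile flag the demo always terminates; without it, termination would merely be likely on most platforms, not guaranteed.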
So far, we have seen that the JLS allows the JIT compiler to generate code that makes single-threaded code faster by reordering or avoiding memory operations. But what happens when other threads can observe the state of the (shared) variables in main memory?
The answer is that the other threads are liable to observe variable states which would appear to be impossible, based on the code order of the Java statements. The solution to this is to use appropriate synchronization. The three main approaches are:

- Using synchronized constructs.
- Using volatile variables.
- Using the java.util.concurrent packages.

But even with these, it is important to understand where synchronization is needed, and what effects you can rely on. This is where the Java Memory Model comes in.
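As a minimal sketch of the first approach (the class name SynchronizedExample is ours), having both the writer and the reader synchronize on the same lock guarantees that a reader sees both assignments or neither:

```java
public class SynchronizedExample {
    private int a, b, c, d;

    // Both methods lock 'this', so doIt() and read() cannot interleave,
    // and the lock release/acquire makes the writes visible to the reader.
    public synchronized void doIt() {
        a = b + 1;
        c = d + 1;
    }

    // A reader can never observe c updated while a is still stale.
    public synchronized int[] read() {
        return new int[] { a, c };
    }

    public static void main(String[] args) {
        SynchronizedExample eg = new SynchronizedExample();
        eg.doIt();
        int[] r = eg.read();
        System.out.println(r[0] + ", " + r[1]);
    }
}
```

Note that synchronizing only the writer would not be enough: the reader must synchronize on the same lock to get the visibility guarantee.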
The Java Memory Model is the section of the JLS that specifies the conditions under which one thread is guaranteed to see the effects of memory writes made by another thread. The Memory Model is specified with a fair degree of formal rigor, and (as a result) requires detailed and careful reading to understand. But the basic principle is that certain constructs create a "happens-before" relation between a write of a variable by one thread and a subsequent read of the same variable by another thread. If the "happens-before" relation exists, the JIT compiler is obliged to generate code that ensures that the read operation sees the value written by the write.
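The principle can be sketched with the volatile publication pattern (class and field names are illustrative): the plain write of data happens-before the volatile write of ready, which happens-before any volatile read that observes true, so a reader that sees the flag is guaranteed to see the data:

```java
public class HappensBefore {
    static int data;               // plain field
    static volatile boolean ready; // volatile flag

    static void writer() {
        data = 42;    // (1) plain write
        ready = true; // (2) volatile write
    }

    static Integer reader() {
        if (ready) {      // (3) volatile read that observes 'true'
            return data;  // (4) guaranteed to see 42:
                          //     (1) happens-before (2) happens-before (3)
        }
        return null;      // flag not seen yet; says nothing about 'data'
    }

    public static void main(String[] args) {
        writer();
        System.out.println(reader());
    }
}
```

If ready were not volatile, step (3) would create no happens-before edge, and a reader could see ready == true while still reading a stale data == 0.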
Armed with this, it is possible to reason about memory coherency in a Java program, and decide whether this will be predictable and consistent for all execution platforms.