Java Language Pitfall - Over-use of primitive wrapper types is inefficient


Example

Consider these two pieces of code:

int a = 1000;
int b = a + 1;

and

Integer a = 1000;
Integer b = a + 1;

Question: Which version is more efficient?

Answer: The two versions look almost identical, but the first version is a lot more efficient than the second one.

The second version uses a representation for the numbers that takes more space, and it relies on auto-boxing and auto-unboxing behind the scenes. In fact, the second version is directly equivalent to the following code:

Integer a = Integer.valueOf(1000);               // box 1000
Integer b = Integer.valueOf(a.intValue() + 1);   // unbox 1000, add 1, box 1001

Compared with the version that uses int, there are clearly three extra method calls when Integer is used. By default, each of the valueOf calls here creates and initializes a new Integer object, because 1000 and 1001 lie outside the range of small values (-128 to 127) that valueOf is required to cache. All of this extra boxing and unboxing work is likely to make the second version an order of magnitude slower than the first one, unless the JIT compiler manages to optimize the boxing away.

In addition to that, the second version is allocating objects on the heap in each valueOf call. While the space utilization is platform specific, it is likely to be in the region of 16 bytes for each Integer object. By contrast, the int version needs zero extra heap space, assuming that a and b are local variables.
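
As a rough illustration of that cost, the following sketch computes the same sum twice, once with a long accumulator and once with a Long accumulator. The class and method names (BoxingSumSketch, sumPrimitive, sumBoxed) are made up for this example, and the timing in main is a crude measurement rather than a proper benchmark, so any numbers it prints should be treated as indicative only.

```java
class BoxingSumSketch {

    // Sums 0 .. n-1 using only primitives: no objects are allocated.
    static long sumPrimitive(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Same sum with a boxed accumulator: each "total += i" unboxes total,
    // adds i, and boxes the result again.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total; // auto-unboxed to long
    }

    public static void main(String[] args) {
        int n = 10_000_000;

        long start = System.nanoTime();
        long primitive = sumPrimitive(n);
        long primitiveNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long boxed = sumBoxed(n);
        long boxedNanos = System.nanoTime() - start;

        System.out.println("primitive: " + primitive + " in " + primitiveNanos + " ns");
        System.out.println("boxed:     " + boxed + " in " + boxedNanos + " ns");
    }
}
```

Both methods return the same value; they differ only in how much work and allocation happens along the way. For measurements you intend to rely on, a benchmarking harness such as JMH is a better choice than System.nanoTime.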


Another big reason why primitives are faster than their boxed equivalents is how their respective array types are laid out in memory.

If you take int[] and Integer[] as an example, in the case of an int[] the int values are contiguously laid out in memory. But in the case of an Integer[] it's not the values that are laid out, but references (pointers) to Integer objects, which in turn contain the actual int values.

Besides adding an extra level of indirection, this can seriously hurt cache locality when iterating over the values. In the case of an int[], the values are contiguous in memory, so the CPU can fetch many of them into its cache at once. But in the case of an Integer[], the CPU potentially has to do an additional memory fetch for each element, since the array only contains references to the actual values.
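
The difference shows up in how the two kinds of array are traversed. The sketch below is only an illustration; ArrayLayoutSketch and its sum methods are hypothetical names chosen for this example, not part of any library.

```java
class ArrayLayoutSketch {

    // int[]: the values sit next to each other in the array itself,
    // so the loop reads memory sequentially.
    static long sum(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Integer[]: the array holds references, so each element is loaded
    // as a reference, followed to an Integer object, and then unboxed.
    static long sum(Integer[] values) {
        long total = 0;
        for (Integer v : values) {
            total += v; // implicit v.intValue()
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] primitives = new int[n];
        Integer[] boxed = new Integer[n];
        for (int i = 0; i < n; i++) {
            primitives[i] = i;
            boxed[i] = i; // auto-boxing: Integer.valueOf(i)
        }
        System.out.println(sum(primitives) == sum(boxed)); // prints true
    }
}
```

The two loops look the same in source form, but the Integer[] version performs an extra dereference per element, and the Integer objects it visits are not guaranteed to be adjacent in memory.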


In short, using primitive wrapper types is relatively expensive in both CPU and memory resources. Using them unnecessarily is inefficient.