When a Java class overrides the equals
method, it should override the hashCode
method as well. As defined in the method's contract:
- Whenever it is invoked on the same object more than once during an execution of a Java application, the
hashCode
method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.- If two objects are equal according to the
equals(Object)
method, then calling thehashCode
method on each of the two objects must produce the same integer result.- It is not required that if two objects are unequal according to the
equals(Object)
method, then calling thehashCode
method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
Hash codes are used in hash implementations such as HashMap
, HashTable
, and HashSet
. The result of the hashCode
function determines the bucket in which an object will be put. These hash implementations are more efficient if the provided hashCode
implementation is good. An important property of good hashCode
implementation is that the distribution of the hashCode
values is uniform. In other words, there is a small probability that numerous instances will be stored in the same bucket.
An algorithm for computing a hash code value may be similar to the following:
public class Foo {
private int field1, field2;
private String field3;
public Foo(int field1, int field2, String field3) {
this.field1 = field1;
this.field2 = field2;
this.field3 = field3;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null || getClass() != obj.getClass()) {
return false;
}
Foo f = (Foo) obj;
return field1 == f.field1 &&
field2 == f.field2 &&
(field3 == null ? f.field3 == null : field3.equals(f.field3);
}
@Override
public int hashCode() {
int hash = 1;
hash = 31 * hash + field1;
hash = 31 * hash + field2;
hash = 31 * hash + (field3 == null ? 0 : field3.hashCode());
return hash;
}
}
In Java 1.2 and above, instead of developing an algorithm to compute a hash code, one can be generated using java.util.Arrays#hashCode
by supplying an Object or primitives array containing the field values:
@Override
public int hashCode() {
return Arrays.hashCode(new Object[] {field1, field2, field3});
}
Java 1.7 introduced the java.util.Objects
class which provides a convenience method, hash(Object... objects)
, that computes a hash code based on the values of the objects supplied to it. This method works just like java.util.Arrays#hashCode
.
@Override
public int hashCode() {
return Objects.hash(field1, field2, field3);
}
Note: this approach is inefficient, and produces garbage objects each time your custom hashCode()
method is called:
Object[]
is created. (In the Objects.hash()
version, the array is created by the "varargs" mechanism.)Arrays.hashCode
or Objects.hash
method.Object.hashCode()
that Arrays.hashCode
or Objects.hash
has to make (probably) cannot be inlined.Since the calculation of an object's hash code can be expensive, it can be attractive to cache the hash code value within the object the first time that it is calculated. For example
public final class ImmutableArray {
private int[] array;
private volatile int hash = 0;
public ImmutableArray(int[] initial) {
array = initial.clone();
}
// Other methods
@Override
public boolean equals(Object obj) {
// ...
}
@Override
public int hashCode() {
int h = hash;
if (h == 0) {
h = Arrays.hashCode(array);
hash = h;
}
return h;
}
}
This approach trades off the cost of (repeatedly) calculating the hash code against the overhead of an extra field to cache the hash code. Whether this pays off as a performance optimization will depend on how often a given object is hashed (looked up) and other factors.
You will also notice that if the true hashcode of an ImmutableArray
happens to be zero (one chance in 232), the cache is ineffective.
Finally, this approach is much harder to implement correctly if the object we are hashing is mutable. However, there are bigger concerns if hash codes change; see the contract above.