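The + symbol can mean three distinct operators in Java:
- If there is no operand before the +, then it is the unary Plus operator.
- If there are two operands, and they are both numeric, then it is the binary Addition operator.
- If there are two operands, and at least one of them is a String, then it is the binary Concatenation operator.
In the simple case, the Concatenation operator joins two strings to give a third string. For example: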
String s1 = "a String";
String s2 = "This is " + s1; // s2 contains "This is a String"
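The three meanings of + can be seen side by side in a small sketch. The class and method names below are made up for illustration. Because + is left-associative, the operand types at each step decide which operator applies, so the position of the first String operand matters.

```java
public class PlusOperatorDemo {
    // Unary plus: there is no operand before the +.
    static int unaryPlus(int x) {
        return +x;
    }

    // Binary addition: both operands are numeric.
    static int addition(int a, int b) {
        return a + b;
    }

    // Concatenation: at least one operand is a String.
    static String concatenation(String a, int b) {
        return a + b;
    }

    // Evaluated as (1 + 2) + "3": addition first, then concatenation.
    static String numbersFirst() {
        return 1 + 2 + "3";
    }

    // Evaluated as ("1" + 2) + 3: concatenation at both steps.
    static String stringFirst() {
        return "1" + 2 + 3;
    }

    public static void main(String[] args) {
        System.out.println(unaryPlus(5));           // 5
        System.out.println(addition(1, 2));         // 3
        System.out.println(concatenation("n=", 1)); // n=1
        System.out.println(numbersFirst());         // 33
        System.out.println(stringFirst());          // 123
    }
}
```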
When one of the two operands is not a string, it is converted to a String as follows:
- An operand whose type is a primitive type is converted as if by calling toString() on the boxed value.
- An operand whose type is a reference type is converted by calling the operand's toString() method. If the operand is null, or if the toString() method returns null, then the string literal "null" is used instead.
For example:
int one = 1;
String s3 = "One is " + one; // s3 contains "One is 1"
String s4 = null + " is null"; // s4 contains "null is null"
String s5 = "{1} is " + new int[]{1}; // s5 contains something like
// "{1} is [I@xxxxxxxx"
The explanation for the s5 example is that the toString() method on array types is inherited from java.lang.Object, and the behavior is to produce a string that consists of the type name and the object's identity hashcode.
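The conversion rules can be checked with a short sketch. The class Nameless is a hypothetical type whose toString() returns null, introduced only to exercise that rule. For arrays, one common way to get readable output is java.util.Arrays.toString, shown here as an alternative rather than something the concatenation operator does itself.

```java
import java.util.Arrays;

public class ConversionDemo {
    // Hypothetical class whose toString() returns null.
    static class Nameless {
        @Override
        public String toString() {
            return null;
        }
    }

    // A primitive operand is converted as if by toString() on the boxed value.
    static String withPrimitive() {
        return "Value: " + 1.5;
    }

    // A null reference is converted to the string "null".
    static String withNullReference() {
        Object o = null;
        return "Value: " + o;
    }

    // A toString() that returns null is also converted to "null".
    static String withNullToString() {
        return "Value: " + new Nameless();
    }

    // Arrays.toString formats the elements instead of type name and hashcode.
    static String withArray() {
        return "Values: " + Arrays.toString(new int[]{1, 2});
    }

    public static void main(String[] args) {
        System.out.println(withPrimitive());     // Value: 1.5
        System.out.println(withNullReference()); // Value: null
        System.out.println(withNullToString());  // Value: null
        System.out.println(withArray());         // Values: [1, 2]
    }
}
```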
The Concatenation operator is specified to create a new String object, except in the case where the expression is a Constant Expression. In the latter case, the expression is evaluated at compile time, and its runtime value is equivalent to a string literal. This means that there is no runtime overhead in splitting a long string literal like this:
String typing = "The quick brown fox " +
"jumped over the " +
"lazy dog"; // constant expression
As noted above, with the exception of constant expressions, each string concatenation expression creates a new String object. Consider this code:
public String stars(int count) {
String res = "";
for (int i = 0; i < count; i++) {
res = res + "*";
}
return res;
}
In the method above, each iteration of the loop will create a new String that is one character longer than the one from the previous iteration. Each concatenation copies all of the characters in the operand strings to form the new String. Thus, stars(N) will:
- create N new String objects, and throw away all but the last one,
- copy N * (N + 1) / 2 characters, and
- generate O(N^2) bytes of garbage.
This is very expensive for large N. Indeed, any code that concatenates strings in a loop is liable to have this problem. A better way to write this would be as follows:
public String stars(int count) {
// Create a string builder with capacity 'count'
StringBuilder sb = new StringBuilder(count);
for (int i = 0; i < count; i++) {
sb.append("*");
}
return sb.toString();
}
Ideally, you should set the capacity of the StringBuilder, but if this is not practical, the class will automatically grow the backing array that the builder uses to hold characters. (Note: the implementation expands the backing array exponentially. This strategy keeps the amount of character copying to O(N) rather than O(N^2).)
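The N * (N + 1) / 2 figure for the concatenating version of stars(N) can be checked by simulation. The sketch below, with made-up names, does not build any strings. It only adds up the length of the String that each loop iteration would create, and compares the total with the closed form.

```java
public class ConcatCostDemo {
    // Adds up the characters written into new String objects by the loop
    // "res = res + "*"" when it runs n times.
    static long charsCopiedByConcatLoop(int n) {
        long copied = 0;
        int length = 0;           // length of res before the current iteration
        for (int i = 0; i < n; i++) {
            length = length + 1;  // the new String is one character longer
            copied += length;     // every character of the new String is copied
        }
        return copied;
    }

    // Closed form for 1 + 2 + ... + n.
    static long closedForm(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(charsCopiedByConcatLoop(1000)); // 500500
        System.out.println(closedForm(1000));              // 500500
    }
}
```

By comparison, a StringBuilder created with capacity N appends each of the N characters once into its backing array, and toString() makes one final copy.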
Some people apply this pattern to all string concatenations. However, this is unnecessary because the JLS allows a Java compiler to optimize string concatenations within a single expression. For example:
String s1 = ...;
String s2 = ...;
String test = "Hello " + s1 + ". Welcome to " + s2 + "\n";
will typically be optimized by the bytecode compiler to something like this:
StringBuilder tmp = new StringBuilder();
tmp.append("Hello ");
tmp.append(s1 == null ? "null" : s1);
tmp.append(". Welcome to ");
tmp.append(s2 == null ? "null" : s2);
tmp.append("\n");
String test = tmp.toString();
(The JIT compiler may optimize that further if it can deduce that s1 or s2 cannot be null.) But note that this optimization is only permitted within a single expression.
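To illustrate the single-expression limit, here is a sketch with made-up method names. The first method builds the same greeting across several statements, so each += is its own concatenation expression and produces its own intermediate String. The second uses one explicit StringBuilder for the whole sequence. Both return the same text, including "null" for a null argument.

```java
public class MultiStatementConcatDemo {
    // Four separate concatenation expressions: the compiler may optimize each
    // one on its own, but every statement still yields a new String.
    static String greetingWithSeparateStatements(String s1, String s2) {
        String test = "Hello ";
        test += s1;
        test += ". Welcome to ";
        test += s2;
        test += "\n";
        return test;
    }

    // One builder shared by all the steps, converted to a String once.
    static String greetingWithBuilder(String s1, String s2) {
        StringBuilder sb = new StringBuilder();
        sb.append("Hello ");
        sb.append(s1);              // append(String) writes "null" for null
        sb.append(". Welcome to ");
        sb.append(s2);
        sb.append("\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(greetingWithSeparateStatements("Ann", "Java"));
        System.out.print(greetingWithBuilder("Ann", "Java"));
    }
}
```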
In short, if you are concerned about the efficiency of string concatenations: