Consider the following code as an illustration:
public String joinWords(List<String> words) {
String message = "";
for (String word : words) {
message = message + " " + word;
}
return message;
}
Unfortunate this code is inefficient if the words
list is long. The root of the problem is this statement:
message = message + " " + word;
For each loop iteration, this statement creates a new message
string containing a copy of all characters in the original message
string with extra characters appended to it. This generates a lot of temporary strings, and does a lot of copying.
When we analyse joinWords
, assuming that there are N words with an average length of M, we find that O(N) temporary strings are created and O(M.N2) characters will be copied in the process. The N2 component is particularly troubling.
The recommended approach for this kind of problem1 is to use a StringBuilder
instead of string concatenation as follows:
public String joinWords2(List<String> words) {
StringBuilder message = new StringBuilder();
for (String word : words) {
message.append(" ").append(word);
}
return message.toString();
}
The analysis of joinWords2
needs to take account of the overheads of "growing" the StringBuilder
backing array that holds the builder's characters. However, it turns out that the number of new objects created is O(logN) and that the number of characters copied is O(M.N) characters. The latter includes characters copied in the final toString()
call.
(It may be possible to tune this further, by creating the StringBuilder
with the correct capacity to start with. However, the overall complexity remains the same.)
Returning to the original joinWords
method, it turns out that the critical statement will be optimized by a typical Java compiler to something like this:
StringBuilder tmp = new StringBuilder();
tmp.append(message).append(" ").append(word);
message = tmp.toString();
However, the Java compiler will not "hoist" the StringBuilder
out of the loop, as we did by hand in the code for joinWords2
.
Reference:
1 - In Java 8 and later, the Joiner
class can be used to solve this particular problem. However, that is not what this example is really supposed to be about.