Java Language Pitfall - String concatenation in a loop does not scale


Example

Consider the following code as an illustration:

public String joinWords(List<String> words) {
    String message = "";
    for (String word : words) {
        message = message + " " + word;
    }
    return message;
}

Unfortunate this code is inefficient if the words list is long. The root of the problem is this statement:

message = message + " " + word;

For each loop iteration, this statement creates a new message string containing a copy of all characters in the original message string with extra characters appended to it. This generates a lot of temporary strings, and does a lot of copying.

When we analyse joinWords, assuming that there are N words with an average length of M, we find that O(N) temporary strings are created and O(M.N2) characters will be copied in the process. The N2 component is particularly troubling.

The recommended approach for this kind of problem1 is to use a StringBuilder instead of string concatenation as follows:

public String joinWords2(List<String> words) {
    StringBuilder message = new StringBuilder();
    for (String word : words) {
        message.append(" ").append(word);
    }
    return message.toString();
}

The analysis of joinWords2 needs to take account of the overheads of "growing" the StringBuilder backing array that holds the builder's characters. However, it turns out that the number of new objects created is O(logN) and that the number of characters copied is O(M.N) characters. The latter includes characters copied in the final toString() call.

(It may be possible to tune this further, by creating the StringBuilder with the correct capacity to start with. However, the overall complexity remains the same.)

Returning to the original joinWords method, it turns out that the critical statement will be optimized by a typical Java compiler to something like this:

  StringBuilder tmp = new StringBuilder();
  tmp.append(message).append(" ").append(word);
  message = tmp.toString();

However, the Java compiler will not "hoist" the StringBuilder out of the loop, as we did by hand in the code for joinWords2.

Reference:


1 - In Java 8 and later, the Joiner class can be used to solve this particular problem. However, that is not what this example is really supposed to be about.