Java Language Strings


Strings (java.lang.String) are pieces of text stored in your program. Although String is not a primitive data type in Java, strings are very common in Java programs.

In Java, Strings are immutable, meaning that they cannot be changed.


Since Java strings are immutable, all methods which manipulate a String will return a new String object. They do not change the original String. This includes the substring and replacement methods that C and C++ programmers might expect to mutate the target String object.
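As a minimal sketch of this behavior, the following example calls a few standard String methods and shows that the original object is never modified. The class name ImmutabilityDemo and the helper afterComma are hypothetical names chosen for illustration.

```java
class ImmutabilityDemo {
    // Hypothetical helper: returns the text after the first comma, trimmed.
    // The argument itself is left untouched.
    static String afterComma(String s) {
        return s.substring(s.indexOf(',') + 1).trim();
    }

    public static void main(String[] args) {
        String original = "Hello, World";

        // Each call returns a NEW String; 'original' is unchanged.
        String upper = original.toUpperCase();
        String tail = afterComma(original);
        String replaced = original.replace('l', 'L');

        System.out.println(original);  // Hello, World
        System.out.println(upper);     // HELLO, WORLD
        System.out.println(tail);      // World
        System.out.println(replaced);  // HeLLo, WorLd

        // A common mistake: discarding the result has no effect.
        original.replace("World", "Java");
        System.out.println(original);  // Hello, World

        // To "change" a variable, reassign it to the returned String.
        original = original.replace("World", "Java");
        System.out.println(original);  // Hello, Java
    }
}
```

Note that the last step does not mutate any String. It only makes the variable refer to a different String object.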

Use a StringBuilder instead of String if you want to concatenate more than two String objects whose values cannot be determined at compile-time. This technique is more performant than creating new String objects and concatenating them because StringBuilder is mutable.
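A typical case is building a string in a loop. The sketch below assumes a hypothetical helper named joinWithCommas; it appends into a single mutable StringBuilder rather than creating a new String on every iteration.

```java
import java.util.Arrays;
import java.util.List;

class JoinDemo {
    // Hypothetical helper: joins the items with ", " using one mutable buffer.
    static String joinWithCommas(List<String> items) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String item : items) {
            if (!first) {
                sb.append(", ");   // separator goes before every item except the first
            }
            sb.append(item);
            first = false;
        }
        return sb.toString();      // one String is created here, at the end
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("alpha", "beta", "gamma");
        System.out.println(joinWithCommas(names)); // alpha, beta, gamma
    }
}
```

For this particular task, String.join(", ", items) (available since Java 8) does the same job; the loop is shown only to illustrate the StringBuilder pattern.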

StringBuffer can also be used to concatenate String objects. However, this class is less performant because it is designed to be thread-safe, and acquires a mutex before each operation. Since you almost never need thread-safety when concatenating strings, it is best to use StringBuilder.

If you can express a string concatenation as a single expression, then it is better to use the + operator. The Java compiler will convert an expression containing + concatenations into an efficient sequence of operations using either String.concat(...) or StringBuilder. The advice to use StringBuilder explicitly only applies when the concatenation is spread across multiple expressions or statements, for example inside a loop.
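For illustration, here is a small sketch of a concatenation that fits in a single expression, where an explicit StringBuilder would add noise without any benefit. The class and method names are hypothetical.

```java
class ConcatDemo {
    // The whole result is built in one expression, so the + operator is fine:
    // the compiler generates the concatenation code for us.
    static String describe(String name, int age) {
        return "Name: " + name + ", age: " + age;
    }

    public static void main(String[] args) {
        System.out.println(describe("Ada", 36)); // Name: Ada, age: 36
    }
}
```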

Don't store sensitive information in strings. If someone is able to obtain a memory dump of your running application, then they will be able to find all of the existing String objects and read their contents. This includes String objects that are unreachable and are awaiting garbage collection. If this is a concern, you will need to wipe sensitive string data as soon as you are done with it. You cannot do this with String objects since they are immutable. Therefore, it is advisable to use char[] objects to hold sensitive character data, and wipe them (e.g. overwrite them with '\000' characters) when you are done.
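The wipe-after-use pattern might look like the sketch below. The names SecretDemo, isLongEnough and checkAndWipe are hypothetical, and isLongEnough is only a stand-in for whatever real processing the secret needs. The array is cleared in a finally block so that it is wiped even if the processing throws.

```java
import java.util.Arrays;

class SecretDemo {
    // Hypothetical stand-in for real processing of the secret.
    static boolean isLongEnough(char[] password) {
        return password.length >= 8;
    }

    // Uses the secret, then overwrites it so it no longer lingers on the heap.
    static boolean checkAndWipe(char[] password) {
        try {
            return isLongEnough(password);
        } finally {
            Arrays.fill(password, '\0'); // overwrite every character with NUL
        }
    }

    public static void main(String[] args) {
        // In real code the array would come from an API that never creates a
        // String, such as java.io.Console.readPassword().
        char[] password = {'s', '3', 'c', 'r', 'e', 't', '!', '!'};
        boolean ok = checkAndWipe(password);
        System.out.println(ok);                   // true
        System.out.println(password[0] == '\0');  // true
    }
}
```

Wiping reduces the window in which the secret is readable, but it is not a complete defense: copies made by other code are not affected.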

All String instances are created on the heap, even instances that correspond to string literals. The special thing about string literals is that the JVM ensures that all literals that are equal (i.e. that consist of the same characters) are represented by a single String object (this behavior is specified in the JLS). This is implemented by JVM class loaders. When a class loader loads a class, it scans for string literals that are used in the class definition. Each time it sees one, it checks whether there is already a record in the string pool for this literal (using the literal as a key). If there is already an entry for the literal, the reference to the String instance stored for that literal is used. Otherwise, a new String instance is created and a reference to the instance is stored for the literal (used as a key) in the string pool. (See also string interning.)
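The effect of the string pool can be observed with reference comparison (==). The following sketch uses hypothetical names and relies only on behavior guaranteed by the JLS: equal literals refer to the same object, new String(...) always creates a distinct object, and intern() returns the pooled instance.

```java
class LiteralPoolDemo {
    // Reference comparison: true only if both variables point to one object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";              // same literal, same pooled object
        String c = new String("hello");  // explicitly creates a distinct object

        System.out.println(sameObject(a, b));          // true
        System.out.println(sameObject(a, c));          // false
        System.out.println(a.equals(c));               // true
        System.out.println(sameObject(a, c.intern())); // true
    }
}
```

This is for illustration only. Application code should compare string contents with equals(...), not with ==.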

The string pool is held in the Java heap, and is subject to normal garbage collection.

In releases of Java before Java 7, the string pool was held in a special part of the heap known as "PermGen". This part was only collected occasionally.

In Java 7, the string pool was moved out of "PermGen" and into the main heap.

Note that string literals are implicitly reachable from any method that uses them. This means that the corresponding String objects can only be garbage collected if the code itself is garbage collected.

Up to and including Java 8, String objects are implemented as a UTF-16 char array (2 bytes per char). Since Java 9 (JEP 254, "Compact Strings"), String is implemented as a byte array with an encoding flag field that records whether the string is stored as one byte per character (LATIN-1) or two bytes per character (UTF-16).