A Stream is a sequence of elements upon which sequential and parallel aggregate operations can be performed. Any given Stream can potentially have an unlimited amount of data flowing through it. As a result, data received from a Stream is processed element by element as it arrives, rather than in a single batch over the whole data set. When combined with lambda expressions, streams provide a concise way to perform operations on sequences of data using a functional approach.
Example:
Stream<String> fruitStream = Stream.of("apple", "banana", "pear", "kiwi", "orange");
fruitStream.filter(s -> s.contains("a"))
           .map(String::toUpperCase)
           .sorted()
           .forEach(System.out::println);
Output:
APPLE
BANANA
ORANGE
PEAR
The operations performed by the above code can be summarized as follows:
Create a Stream<String> containing an ordered sequence of fruit String elements using the static factory method Stream.of(values).
The filter() operation retains only the elements that match a given predicate (the elements for which the predicate returns true). In this case, it retains the elements containing an "a". The predicate is given as a lambda expression.
The map() operation transforms each element using a given function, called a mapper. In this case, each fruit String is mapped to its uppercase String version using the method reference String::toUpperCase.
Note that the map() operation returns a stream with a different generic type if the mapping function returns a type different from that of its input parameter. For example, calling .map(String::isEmpty) on a Stream<String> returns a Stream<Boolean>.
The sorted() operation sorts the elements of the Stream according to their natural ordering (lexicographically, in the case of String).
Finally, the forEach(action) operation performs an action on each element of the Stream by passing it to a Consumer. In the example, each element is simply printed to the console. This is a terminal operation, so no further operations can be applied to the Stream afterwards.
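The note on map() changing the element type can be illustrated with a minimal sketch. The class name MapTypeChangeExample and the helper emptinessFlags are hypothetical names chosen for this illustration; only Stream.of, map, and collect come from the Stream API.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class, not part of the original text.
class MapTypeChangeExample {

    // Maps every String to a Boolean that tells whether it is empty.
    static List<Boolean> emptinessFlags(Stream<String> words) {
        // The mapper returns boolean, so Stream<String> becomes Stream<Boolean>.
        Stream<Boolean> flags = words.map(String::isEmpty);
        return flags.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [false, true, false]
        System.out.println(emptinessFlags(Stream.of("apple", "", "kiwi")));
    }
}
```

The intermediate variable flags is only there to make the changed generic type visible; in practice the calls would usually be chained.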
Note that the operations defined on the Stream are only performed because of the terminal operation. Without a terminal operation, the stream is not processed. Streams cannot be reused: once a terminal operation has been called, the Stream object becomes unusable.
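Both points, lazy processing and single use, can be observed with a short sketch. The class LazyStreamExample and its method names are hypothetical; peek() is used here only as a way to record which elements were actually visited.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical example class, not part of the original text.
class LazyStreamExample {

    // Builds a pipeline without a terminal operation; no element is visited.
    static int visitedWithoutTerminal() {
        List<String> visited = new ArrayList<>();
        Stream<String> pipeline = Stream.of("apple", "banana").peek(visited::add);
        return visited.size();
    }

    // The same pipeline with a terminal operation; every element is visited.
    static int visitedWithTerminal() {
        List<String> visited = new ArrayList<>();
        Stream.of("apple", "banana").peek(visited::add).forEach(s -> { });
        return visited.size();
    }

    // A second terminal operation on the same Stream object is rejected.
    static boolean reuseFails() {
        Stream<String> stream = Stream.of("apple", "banana");
        stream.forEach(s -> { });
        try {
            stream.forEach(s -> { });
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(visitedWithoutTerminal()); // prints 0
        System.out.println(visitedWithTerminal());    // prints 2
        System.out.println(reuseFails());             // prints true
    }
}
```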
Operations (as seen above) are chained together to form what can be seen as a query on the data.
Note that a Stream generally does not have to be closed. It is only required to close streams that operate on IO channels. Most Stream types don't operate on resources and therefore don't require closing.
The Stream interface extends AutoCloseable. Streams can be closed by calling the close() method or by using a try-with-resources statement.
An example use case where a Stream should be closed is when you create a Stream of lines from a file:
try (Stream<String> lines = Files.lines(Paths.get("somePath"))) {
    lines.forEach(System.out::println);
}
The Stream interface also provides the onClose() method, which lets you register Runnable handlers that are called when the stream is closed. An example use case is code that produces a stream and needs to know when the stream has been consumed in order to perform some cleanup.
public Stream<String> streamAndDelete(Path path) throws IOException {
    // The handler runs when the returned stream is closed.
    return Files.lines(path).onClose(() -> someClass.deletePath(path));
}
The registered handler only runs if the close() method is called, either explicitly or implicitly by a try-with-resources statement.
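The difference between finishing a stream and closing it can be checked with a small sketch. The class OnCloseExample and its method names are hypothetical, and an in-memory stream stands in for a file-backed one so that the example runs without any files.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

// Hypothetical example class, not part of the original text.
class OnCloseExample {

    // A terminal operation alone does not trigger the close handler.
    static boolean handlerRanAfterTerminalOnly() {
        AtomicBoolean closed = new AtomicBoolean(false);
        Stream<String> stream = Stream.of("a", "b").onClose(() -> closed.set(true));
        stream.forEach(s -> { });
        return closed.get();
    }

    // try-with-resources calls close() implicitly, which runs the handler.
    static boolean handlerRanAfterTryWithResources() {
        AtomicBoolean closed = new AtomicBoolean(false);
        try (Stream<String> stream = Stream.of("a", "b").onClose(() -> closed.set(true))) {
            stream.forEach(s -> { });
        }
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println(handlerRanAfterTerminalOnly());     // prints false
        System.out.println(handlerRanAfterTryWithResources()); // prints true
    }
}
```

This is why a method such as streamAndDelete only cleans up reliably when its callers wrap the returned stream in a try-with-resources statement.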
A Stream object's processing can be sequential or parallel.
In sequential mode, the elements are processed in the order of the source of the Stream. If the Stream is ordered (for example, when its source is a SortedMap implementation or a List), the processing is guaranteed to match the ordering of the source. In other cases, however, care should be taken not to depend on the ordering (for example, a HashMap makes no guarantees about the iteration order of its keySet()).
Example:
List<Integer> integerList = Arrays.asList(0, 1, 2, 3, 42);
// sequential
long howManyOddNumbers = integerList.stream()
                                    .filter(e -> (e % 2) == 1)
                                    .count();
System.out.println(howManyOddNumbers); // Output: 2
Parallel mode allows the use of multiple threads on multiple cores but there is no guarantee of the order in which elements are processed.
If multiple operations are chained on a sequential Stream, not every operation necessarily has work to do. For example, if a Stream is filtered and the number of elements is reduced to one, a subsequent operation such as sorted() has little or no work left to do. This can increase the performance of a sequential Stream, an optimization that is not possible with a parallel Stream.
Example:
// parallel
long howManyOddNumbersParallel = integerList.parallelStream()
                                            .filter(e -> (e % 2) == 1)
                                            .count();
System.out.println(howManyOddNumbersParallel); // Output: 2
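The count is the same in both modes because it does not depend on processing order. Where order does matter, a parallel stream can still be asked to respect the encounter order of its source with forEachOrdered(). The following sketch assumes a hypothetical class ParallelOrderExample and uses a thread-safe CopyOnWriteArrayList to record the order in which elements are handled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical example class, not part of the original text.
class ParallelOrderExample {

    // forEachOrdered() handles elements in the encounter order of the source.
    static List<Integer> encounterOrder(List<Integer> source) {
        List<Integer> seen = new CopyOnWriteArrayList<>();
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    // forEach() on a parallel stream may handle elements in any order.
    static List<Integer> anyOrder(List<Integer> source) {
        List<Integer> seen = new CopyOnWriteArrayList<>();
        source.parallelStream().forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> integerList = Arrays.asList(0, 1, 2, 3, 42);
        System.out.println(encounterOrder(integerList)); // prints [0, 1, 2, 3, 42]
        System.out.println(anyOrder(integerList));       // same elements, order may vary between runs
    }
}
```

Enforcing encounter order can cost some of the benefit of parallelism, so it is worth using only when the order of side effects is actually required.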
While some actions can be performed on both containers (collections) and Streams, they ultimately serve different purposes and support different operations. Containers are more focused on how the elements are stored and how those elements can be accessed efficiently. A Stream, on the other hand, doesn't provide direct access to, or manipulation of, its elements; it is more concerned with the group of objects as a collective entity and with performing operations on that entity as a whole. Stream and Collection are separate high-level abstractions for these differing purposes.
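The contrast can be sketched as follows: a List offers indexed access to stored elements, while a Stream describes a computation over them whose result has to be collected before individual elements can be accessed. The class CollectionVersusStreamExample and the helper fruitsContainingA are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, not part of the original text.
class CollectionVersusStreamExample {

    // Treats the fruits as a whole and collects the matches into a new List.
    static List<String> fruitsContainingA(List<String> fruits) {
        return fruits.stream()
                     .filter(s -> s.contains("a"))
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "banana", "pear", "kiwi", "orange");

        // A collection gives direct access to a stored element.
        System.out.println(fruits.get(1)); // prints banana

        // A stream has no get(index); its result is collected first.
        System.out.println(fruitsContainingA(fruits)); // prints [apple, banana, pear, orange]

        // The source collection is left unchanged by the stream pipeline.
        System.out.println(fruits.size()); // prints 5
    }
}
```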