Advanced Stream Ops

The Stream API is the heart of functional programming in Java. It lets you process sequences of elements declaratively: you say what you want done rather than how to loop through the elements.

A Stream is not a data structure. It doesn’t store data. It’s a conveyor belt that moves data from a Source through a Pipeline of operations to a Terminal destination.

1. The Stream Pipeline

Every stream pipeline has three parts:

  1. Source: Collection, Array, I/O channel, etc.
  2. Intermediate Operations: filter, map, sorted (Lazy; return a new Stream).
  3. Terminal Operation: collect, forEach, reduce (Eager; produce a result or side-effect).

[!IMPORTANT] Intermediate operations are lazy. Nothing happens until the terminal operation is invoked. If you don’t call a terminal operation, the stream effectively does nothing.
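To make the laziness concrete, here is a minimal sketch that counts how often a filter predicate runs before and after the terminal operation. The class name `LazyPipelineDemo`, the `calls` counter, and the sample words are illustrative assumptions, not part of any library.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Hypothetical counter: records how many times the filter predicate actually runs.
    static final AtomicInteger calls = new AtomicInteger();

    // Builds a pipeline of intermediate operations only; no element is processed here.
    static Stream<String> buildPipeline(List<String> source) {
        return source.stream()
                .filter(s -> {
                    calls.incrementAndGet();
                    return s.length() > 3;
                })
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        Stream<String> pipeline = buildPipeline(List.of("java", "go", "rust", "c"));
        System.out.println("Predicate calls before terminal op: " + calls.get()); // 0

        // The terminal operation pulls every element through the pipeline.
        List<String> result = pipeline.collect(Collectors.toList());
        System.out.println("Predicate calls after terminal op: " + calls.get()); // 4
        System.out.println(result); // [JAVA, RUST]
    }
}
```

The counter stays at zero until `collect` is called, which is the point of the callout above.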

2. Intermediate Operations

1. filter & map (The Bread and Butter)

  • filter(Predicate<T>): Keeps elements where the predicate is true.
  • map(Function<T, R>): Transforms each element.
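As a small illustration of the two operations combined, the sketch below keeps only words of a minimum length and then maps each one to its length. The helper `lengthsOfLongWords` and the sample data are assumptions made for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterMapDemo {
    // Hypothetical helper: keep words with at least minLength characters, then map each to its length.
    static List<Integer> lengthsOfLongWords(List<String> words, int minLength) {
        return words.stream()
                .filter(w -> w.length() >= minLength) // Predicate<String>: keep matching words
                .map(String::length)                  // Function<String, Integer>: transform each word
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "map", "collect", "of");
        System.out.println(lengthsOfLongWords(words, 4)); // [6, 7]
    }
}
```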

2. flatMap (Flattening)

flatMap is used when each element maps to multiple new elements (or a stream of elements), and you want a single flat stream as a result. Think of it as “Map then Flatten”.

[Diagram: ["A", "B"] and ["C", "D"] each become a Stream<String>; flattening merges them into a single stream: "A", "B", "C", "D"]
// Example: List of Sentences -> List of Words
List<String> sentences = Arrays.asList("Hello World", "Java Streams");

List<String> words = sentences.stream()
    .flatMap(s -> Arrays.stream(s.split(" "))) // each sentence -> String[] of words -> one flat Stream<String>
    .toList();

// Output: ["Hello", "World", "Java", "Streams"]
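The same idea applies to nested collections. The following sketch flattens a list of lists into one list; the class and method names are hypothetical and chosen only for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

class FlattenDemo {
    // Hypothetical helper: turn a List<List<String>> into a single flat List<String>.
    static List<String> flatten(List<List<String>> nested) {
        return nested.stream()
                .flatMap(List::stream) // each inner list becomes a stream, then all are merged
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> nested = List.of(List.of("A", "B"), List.of("C", "D"));
        System.out.println(flatten(nested)); // [A, B, C, D]
    }
}
```

Using `map(List::stream)` here would instead yield a `Stream<Stream<String>>`, which is why the flattening step matters.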

3. State-Dependent Ops

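  • distinct(): Removes duplicates (compared with equals(), so hashCode() must be consistent with it).
  • sorted(): Sorts elements by natural order or a supplied Comparator. It must see every element before it can emit the first one.
  • skip(n) / limit(n): Discards the first n elements, or truncates the stream to at most n elements.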
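These operations are often chained. As a rough sketch, the hypothetical helper `secondPage` below deduplicates and sorts a list, then uses skip and limit to select the second "page" of results; the paging scenario is an assumption chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class StatefulOpsDemo {
    // Hypothetical helper: unique values in ascending order, second page of size pageSize.
    static List<Integer> secondPage(List<Integer> values, int pageSize) {
        return values.stream()
                .distinct()      // 5, 3, 8, 1, 9, 7 for the sample below
                .sorted()        // 1, 3, 5, 7, 8, 9
                .skip(pageSize)  // drop the first page
                .limit(pageSize) // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(5, 3, 8, 3, 1, 9, 5, 7);
        System.out.println(secondPage(values, 2)); // [5, 7]
    }
}
```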

3. Interactive: Stream Pipeline Visualizer

The visualizer is a marble diagram that shows how data flows and transforms through a pipeline:

[Marble diagram: Source → Op 1 → Intermediate → Op 2 → Result]

4. Hardware Reality: Boxing Overhead

A common performance pitfall with Streams is autoboxing.

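If you use Stream<Integer>, every element travels through the pipeline as an Integer object instead of a raw int. The boxing and unboxing add overhead: extra memory per element and more work for the garbage collector.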

The Fix: Primitive Streams

Java provides IntStream, LongStream, and DoubleStream to process primitives directly without boxing.

// BAD: Boxing Overhead (Stream<Integer>)
int boxedSum = Stream.of(1, 2, 3, 4, 5)
    .reduce(0, Integer::sum);

// GOOD: No Boxing (IntStream)
int primitiveSum = IntStream.of(1, 2, 3, 4, 5)
    .sum();

[!TIP] Prefer IntStream.range() over Stream.iterate() for numeric loops. It avoids boxing every value into an Integer and is typically faster.
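A minimal side-by-side sketch of the two approaches, summing 1 through n. The method names are assumptions for this example, and no timing is claimed here; measure with a benchmark harness before drawing conclusions about speed.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

class RangeDemo {
    // Boxed version: Stream.iterate produces a Stream<Integer>, so each value is an Integer.
    static int sumBoxed(int n) {
        return Stream.iterate(1, i -> i + 1)
                .limit(n)
                .reduce(0, Integer::sum);
    }

    // Primitive version: values stay as int from start to finish.
    static int sumPrimitive(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(100));     // 5050
        System.out.println(sumPrimitive(100)); // 5050
    }
}
```

Note that `range(a, b)` excludes `b`, while `rangeClosed(a, b)` includes it.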

5. Collecting Results (Collectors)

collect() is the most common terminal operation. It gathers elements into a container.

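// Basic Collection (a stream can be consumed only once, so each collect gets a fresh stream)
List<String> names = List.of("Ann", "Bob", "Ann"); // sample data for illustration
List<String> list = names.stream().collect(Collectors.toList());
Set<String> set = names.stream().collect(Collectors.toSet());
String joined = names.stream().collect(Collectors.joining(", "));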

Advanced Collectors: groupingBy

This is the SQL GROUP BY equivalent. By default it returns a Map<K, List<V>>, where each key maps to the list of elements that produced it.

List<String> words = Arrays.asList("apple", "banana", "apricot", "cherry");

// Group by first letter
Map<Character, List<String>> byLetter = words.stream()
    .collect(Collectors.groupingBy(s -> s.charAt(0)));

// Result: { 'a': ["apple", "apricot"], 'b': ["banana"], 'c': ["cherry"] }
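groupingBy also accepts a second, "downstream" collector that decides what is stored per key instead of a list. The sketch below counts words per first letter, much like SQL's COUNT(*) with GROUP BY; the class and method names are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingCountDemo {
    // Hypothetical helper: number of words per first letter.
    static Map<Character, Long> countByFirstLetter(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        s -> s.charAt(0),        // classifier: the grouping key
                        Collectors.counting())); // downstream: count elements per key
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "banana", "apricot", "cherry");
        Map<Character, Long> counts = countByFirstLetter(words);
        System.out.println(counts.get('a')); // 2
        System.out.println(counts.get('b')); // 1
    }
}
```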

Partitioning

partitioningBy is a special case of grouping where the key is a boolean.

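// Split into even and odd (true -> evens, false -> odds)
List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
Map<Boolean, List<Integer>> oddEven = numbers.stream()
    .collect(Collectors.partitioningBy(n -> n % 2 == 0));
// Result: { false: [1, 3, 5], true: [2, 4, 6] }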

6. Comparisons: Java Stream vs. Go Loop

The same computation, written in Java’s declarative style and in Go’s imperative style.

Java Stream

List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);

int sumSquares = numbers.stream()
    .filter(n -> n % 2 == 0) // Filter evens
    .mapToInt(n -> n * n)    // Square them
    .sum();                  // Sum result

// Result: 4 + 16 + 36 = 56

Go Loop

numbers := []int{1, 2, 3, 4, 5, 6}
sumSquares := 0

for _, n := range numbers {
    if n % 2 == 0 {       // Filter
        square := n * n   // Map
        sumSquares += square // Reduce
    }
}
// Go requires manual iteration and accumulation

7. The reduce Operation

When no built-in collector or reduction fits, reduce lets you combine elements into a single result by repeatedly applying an accumulator function, starting from an identity value.

// reduce(identity, accumulator): 0 is the identity for addition
int sum = numbers.stream()
    .reduce(0, (a, b) -> a + b);

[!TIP] Prefer specific reductions like sum(), max(), or collect() over generic reduce() when possible, as they are more readable and optimized.
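To illustrate the trade-off, the sketch below uses generic reduce for a product, which has no dedicated method on streams, and the specialized sum() and max() where they exist. The class and method names are assumptions for this example.

```java
import java.util.List;
import java.util.OptionalInt;

class ReduceDemo {
    // No built-in product(), so generic reduce is a reasonable fit.
    // The identity is 1 because 1 * x == x.
    static int product(List<Integer> numbers) {
        return numbers.stream().reduce(1, (a, b) -> a * b);
    }

    // A specialized reduction on IntStream: reads better than reduce(0, Integer::sum).
    static int sum(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }

    // max() returns an OptionalInt because an empty stream has no maximum.
    static OptionalInt max(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).max();
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);
        System.out.println(product(numbers));        // 120
        System.out.println(sum(numbers));            // 15
        System.out.println(max(numbers).getAsInt()); // 5
    }
}
```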


Next Steps

Streams are powerful, but are they fast? In the next chapter, we’ll explore Parallel Streams and how to use multiple CPU cores to process data.