Chapter 6 Flashcards — Lambdas and Streams

flashcards effective-java lambdas streams functional-programming java

What is a functional interface and how does it enable lambdas?
?
A functional interface is any interface with exactly one abstract method (it may also have any number of default or static methods, and abstract redeclarations of public Object methods such as equals do not count). This single abstract method defines the lambda's signature: the compiler uses the interface as the target type for type inference. The @FunctionalInterface annotation makes the compiler verify this constraint. Examples: Runnable (one abstract method: run()), Comparator<T> (compare()), Predicate<T> (test()). A lambda can be assigned to any functional interface whose abstract method has a compatible signature.
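
To make the single-abstract-method rule concrete, here is a minimal sketch. The Transformer interface and the class name are hypothetical, invented only for this card; the point is that the same lambda text can target either Transformer or the standard Function<String, String>.

```java
import java.util.function.Function;

final class FunctionalInterfaceSketch {

    // Hypothetical interface, for illustration only: exactly one abstract method.
    @FunctionalInterface
    interface Transformer {
        String transform(String input);

        // Default methods do not count toward the single-abstract-method limit.
        default Transformer twice() {
            return s -> transform(transform(s));
        }
    }

    static String demo() {
        // The same lambda text targets two different compatible interfaces;
        // the declared type on the left decides which one it becomes.
        Transformer exclaim = s -> s + "!";
        Function<String, String> exclaimFn = s -> s + "!";
        return exclaim.twice().transform("hi") + " " + exclaimFn.apply("hi");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hi!! hi!"
    }
}
```

Adding a second abstract method to Transformer would turn the annotation into a compile error, which is exactly the safety net it is meant to provide.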

What are the three limitations of lambdas compared to anonymous classes?
?

No self-reference: There is no name for a lambda itself inside its body. You cannot call the lambda recursively or register it as a listener that removes itself. Anonymous classes have this referring to themselves.
Cannot implement multiple interfaces: A lambda targets exactly one functional interface. An anonymous class can implement multiple interfaces or extend an abstract class.
Cryptic stack traces: Lambda stack frames have synthetic names like $$Lambda$1/0x..., which makes debugging harder than with named methods or with anonymous classes (whose frames at least show a stable name such as Outer$1).
These limitations rarely matter — but when they do, use an anonymous class or extract a named method.
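
The self-reference limitation is the easiest one to see in code. The sketch below uses a hypothetical one-shot listener (all names are made up for this card) that removes itself from a listener list, which works only because an anonymous class has a this that refers to itself.

```java
import java.util.ArrayList;
import java.util.List;

final class SelfRemovingListenerSketch {

    // Fires all listeners twice and returns how often the one-shot listener ran.
    static int fireTwice() {
        List<Runnable> listeners = new ArrayList<>();
        int[] calls = {0};

        listeners.add(new Runnable() {
            @Override
            public void run() {
                calls[0]++;
                // 'this' is the anonymous Runnable itself; a lambda has no such name.
                listeners.remove(this);
            }
        });

        for (int round = 0; round < 2; round++) {
            // Iterate over a copy so a listener may remove itself safely.
            for (Runnable listener : new ArrayList<>(listeners)) {
                listener.run();
            }
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println(fireTwice()); // prints 1
    }
}
```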

What are the four types of method references? Give a syntax example and equivalent lambda for each.
?

Type	Syntax	Lambda Equivalent
Static	`Integer::parseInt`	`s -> Integer.parseInt(s)`
Bound instance	`System.out::println`	`x -> System.out.println(x)`
Unbound instance	`String::toUpperCase`	`s -> s.toUpperCase()`
Constructor	`ArrayList::new`	`() -> new ArrayList<>()`

Static: receiver is the class. Bound: receiver is a captured instance. Unbound: receiver is the first argument. Constructor: creates a new object.
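
A small runnable sketch of the four kinds side by side. The class name and the prefix value are illustrative assumptions; prefix::concat stands in for the bound case because it returns a value that is easy to check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

final class MethodReferenceKinds {

    static List<String> demo() {
        // Static: the class is named, the argument is forwarded.
        Function<String, Integer> parse = Integer::parseInt;

        // Bound instance: the receiver (prefix) is captured when the reference is created.
        String prefix = "id-";
        Function<String, String> addPrefix = prefix::concat;

        // Unbound instance: the first argument becomes the receiver.
        UnaryOperator<String> upper = String::toUpperCase;

        // Constructor: each call to get() creates a new list.
        Supplier<List<String>> newList = ArrayList::new;

        List<String> out = newList.get();
        out.add(addPrefix.apply(upper.apply("ab")) + parse.apply("42"));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [id-AB42]
    }
}
```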

When should you prefer a lambda over a method reference?
?
Prefer a lambda when:

The method name is longer than a short lambda: () -> action() vs GoshThisClassHasAVeryLongName::action
The lambda’s parameter names provide documentation: (numerator, denominator) -> numerator / denominator is clearer than a cryptically named static method
The lambda transforms or combines arguments before passing them — method references can only forward arguments as-is
Otherwise, prefer method references — they leverage the existing method’s name for self-documentation.

What are the six primary standard functional interfaces in java.util.function?
?

Interface	Signature	Use case
`Predicate<T>`	`boolean test(T t)`	Boolean test of one argument
`Function<T, R>`	`R apply(T t)`	Transform T to R
`Supplier<T>`	`T get()`	Produce a T (no input)
`Consumer<T>`	`void accept(T t)`	Consume a T (no output)
`UnaryOperator<T>`	`T apply(T t)`	Transform T to same type T
`BinaryOperator<T>`	`T apply(T t1, T t2)`	Combine two T values into T
Memorize these six — everything else in `java.util.function` is a variation (Bi-forms, primitive specializations).
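
One way to drill the six is to use each of them once in a tiny program. This is only a sketch; the class and variable names are hypothetical.

```java
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

final class SixInterfacesSketch {

    static String demo() {
        StringBuilder log = new StringBuilder();

        Predicate<String> isEmpty = String::isEmpty;          // T -> boolean
        Function<String, Integer> length = String::length;    // T -> R
        Supplier<String> greeting = () -> "hello";            // () -> T
        Consumer<String> appendToLog = log::append;           // T -> void
        UnaryOperator<String> upper = String::toUpperCase;    // T -> T
        BinaryOperator<Integer> add = Integer::sum;           // (T, T) -> T

        String word = greeting.get();
        if (!isEmpty.test(word)) {
            appendToLog.accept(upper.apply(word));
        }
        return log + ":" + add.apply(length.apply(word), 1);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints HELLO:6
    }
}
```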

Why should you use primitive specializations of functional interfaces (e.g., IntPredicate vs Predicate<Integer>)?
?
Predicate<Integer> requires autoboxing every int value to an Integer object — wasteful heap allocation and GC pressure in tight loops. IntPredicate takes a primitive int directly, avoiding all boxing. The JDK provides specializations for int, long, and double:

IntPredicate, LongPredicate, DoublePredicate
IntFunction<R>, LongFunction<R>, DoubleFunction<R>
IntSupplier, LongSupplier, DoubleSupplier
IntConsumer, LongConsumer, DoubleConsumer
IntUnaryOperator, LongUnaryOperator, DoubleUnaryOperator
IntBinaryOperator, LongBinaryOperator, DoubleBinaryOperator
Always prefer these when working with primitive streams or primitive values.
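
As a rough illustration, the sketch below keeps everything in primitive int form by pairing IntPredicate with IntStream. The class and method names are assumptions made for this card.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

final class PrimitiveSpecializationSketch {

    // Counts even numbers in [0, upperExclusive) without creating Integer objects.
    static long countEvens(int upperExclusive) {
        IntPredicate isEven = n -> n % 2 == 0;
        return IntStream.range(0, upperExclusive).filter(isEven).count();
    }

    // IntStream.map takes an IntUnaryOperator, so the squares stay primitive too.
    static int sumOfSquares(int upperInclusive) {
        return IntStream.rangeClosed(1, upperInclusive).map(n -> n * n).sum();
    }

    public static void main(String[] args) {
        System.out.println(countEvens(10));  // prints 5
        System.out.println(sumOfSquares(3)); // prints 14
    }
}
```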

When is it appropriate to define a custom functional interface instead of using one from java.util.function?
?
Define a custom @FunctionalInterface when:

The interface will be commonly used and benefits from a descriptive name (e.g., Comparator<T> is more descriptive than BiFunction<T, T, Integer>)
There is a strong contract that should be documented (e.g., Comparator’s total-ordering requirements)
The interface benefits from default methods (e.g., Comparator.thenComparing(), reversed())
The interface must declare a checked exception — standard functional interfaces cannot (Function<T,R> cannot throw IOException; a custom ThrowingFunction<T,R> can)
Otherwise, use the standard interface. Always annotate custom functional interfaces with @FunctionalInterface.
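
The checked-exception case can be sketched as follows. ThrowingFunction is not a JDK type; it and the readOrFail helper are hypothetical names used only to show the shape of such an interface, plus one common way to adapt it back to a standard Function.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Function;

final class ThrowingFunctionSketch {

    // Hypothetical custom functional interface whose method may throw IOException.
    @FunctionalInterface
    interface ThrowingFunction<T, R> {
        R apply(T t) throws IOException;

        // Adapter: wraps the checked exception so the result fits java.util.function.Function.
        static <T, R> Function<T, R> unchecked(ThrowingFunction<T, R> f) {
            return t -> {
                try {
                    return f.apply(t);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            };
        }
    }

    // Stand-in for real I/O so the example stays self-contained.
    static String readOrFail(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("empty name");
        }
        return "contents of " + name;
    }

    static Function<String, String> reader() {
        return ThrowingFunction.unchecked(ThrowingFunctionSketch::readOrFail);
    }

    static boolean emptyNameFailsUnchecked() {
        try {
            reader().apply("");
            return false;
        } catch (UncheckedIOException e) {
            return true;
        }
    }
}
```

Writing Function<String, String> f = ThrowingFunctionSketch::readOrFail directly would not compile, because Function.apply declares no checked exceptions.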

What does @FunctionalInterface do? Is it required?
?
@FunctionalInterface is a marker annotation that tells the compiler to:

Verify the interface has exactly one abstract method — if it has zero or more than one, it’s a compile error
Document intent — readers know the interface is designed for lambda use
It is not required for an interface to be used as a lambda target. Any single-abstract-method interface works. But you should always add it to interfaces you intend as functional interfaces — it catches accidental addition of abstract methods that would break existing lambdas.

What is the difference between map and flatMap in streams?
?

map(Function<T, R>) applies a function producing one output per input: Stream<T> → Stream<R>. The output is a stream of individual elements.
flatMap(Function<T, Stream<R>>) applies a function producing a stream per input, then flattens all streams into one: Stream<T> → Stream<R> (via flattening). Use when the mapping produces 0 or more results per element.

List<String> words = List.of("hello", "world");

// map: one word → one length
words.stream().map(String::length)                      // 5, 5

// flatMap: one word → many single-character strings, then flatten
words.stream().flatMap(w -> Arrays.stream(w.split(""))) // h, e, l, l, o, w, o, r, l, d

What is lazy evaluation in streams and why does it matter?
?
Stream intermediate operations (filter, map, flatMap, sorted, limit, distinct) are lazy — they do not process any elements until a terminal operation (collect, forEach, count, findFirst, etc.) is called. This has two important consequences:

Short-circuiting: Operations like findFirst() or limit(n) can stop processing after the first match or N elements, never examining the rest of the source.
No intermediate materialization: A chain of filter().map().filter() does not create intermediate collections — elements flow through the pipeline one at a time (or in chunks for parallel).
This makes streams memory-efficient for large data sources and enables infinite streams (Stream.iterate()).
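
A minimal sketch of both consequences, assuming a sequential stream: the source is infinite, yet the pipeline terminates because limit(3) short-circuits. The examined counter is instrumentation added only for this demo, and the class name is hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyEvaluationSketch {

    // Demo-only counter recording how many source elements were actually pulled.
    static final AtomicInteger examined = new AtomicInteger();

    static List<Integer> firstThreeEvenSquares() {
        examined.set(0);
        return Stream.iterate(1, n -> n + 1)              // infinite source: 1, 2, 3, ...
                .peek(n -> examined.incrementAndGet())    // instrumentation only
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .limit(3)                                 // short-circuits the infinite source
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstThreeEvenSquares()); // prints [4, 16, 36]
        System.out.println("elements examined: " + examined.get());
    }
}
```

Nothing runs until collect is called, and only a small, finite number of source elements is ever examined.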

What is the correct use of forEach in a stream pipeline?
?
forEach should be used only for reporting results — printing, logging, writing to external storage — not for computation or accumulation. Side-effecting forEach in a stream defeats the purpose of the functional model and breaks parallelism:

// BAD: Side effect in stream — mutates external list
List<String> result = new ArrayList<>();
stream.forEach(s -> result.add(s)); // race condition in parallel!
 
// GOOD: Use a collector
List<String> result = stream.collect(Collectors.toList());
 
// GOOD: forEach for reporting only
result.forEach(System.out::println);

The rule: forEach should do nothing that changes the state of the program in a way that affects correctness. Logging and printing are acceptable; accumulation is not.

What does Stream.toList() return, and how does it differ from Collectors.toList()?
?

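Collectors.toList() (Java 8+): In practice returns a mutable ArrayList, but the specification guarantees neither the type nor the mutability of the result. Use Collectors.toCollection(ArrayList::new) if you rely on being able to add, remove, or set elements.
Stream.toList() (Java 16+): Returns an unmodifiable List. Any attempt to mutate it throws UnsupportedOperationException.
Collectors.toUnmodifiableList() (Java 10+): Also unmodifiable, but more verbose, and unlike Stream.toList() it rejects null elements with a NullPointerException.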

stream.toList()                              // unmodifiable (Java 16+) — prefer this
stream.collect(Collectors.toList())          // mutable ArrayList
stream.collect(Collectors.toUnmodifiableList()) // unmodifiable (Java 10+)

Default to stream.toList() (Java 16+) unless you specifically need a mutable list. Note: stream.toList() permits null elements; List.of(...) does not.
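
The differences can be checked with a short probe. This is a sketch that needs Java 16 or later; the class and method names are hypothetical, and the first check reflects the current ArrayList-backed behavior of Collectors.toList() rather than a documented guarantee.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ToListSketch {

    // Returns true if the list accepts an add, false if it is unmodifiable.
    static boolean isModifiable(List<String> list) {
        try {
            list.add("x");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    static boolean collectorsToListIsModifiable() {
        return isModifiable(Stream.of("a").collect(Collectors.toList()));
    }

    static boolean streamToListIsModifiable() {
        return isModifiable(Stream.of("a").toList());
    }

    // Stream.toList() keeps null elements.
    static boolean streamToListKeepsNull() {
        return Stream.of("a", null).toList().get(1) == null;
    }

    // Collectors.toUnmodifiableList() throws NullPointerException on null elements.
    static boolean unmodifiableCollectorRejectsNull() {
        try {
            Stream.of("a", null).collect(Collectors.toUnmodifiableList());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```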

What are the essential collectors from Collectors? List at least 10.
?

toList() — mutable list; toUnmodifiableList() — immutable; Stream.toList() — immutable (Java 16)
toSet() — mutable set; toUnmodifiableSet()
toCollection(Supplier<C>) — specific collection type (e.g., TreeSet::new)
toMap(keyFn, valueFn) — throws on duplicate keys
toMap(keyFn, valueFn, mergeFn) — merge duplicates
groupingBy(classifier) — Map<K, List<V>>
groupingBy(classifier, downstream) — with downstream collector
partitioningBy(predicate) — Map<Boolean, List<T>>
counting() — count elements
joining(delimiter) — concatenate strings with separator
summarizingInt/Long/Double(fn) — stats object (count, sum, min, max, avg)
mapping(fn, downstream) — map then collect
filtering(pred, downstream) — filter then collect (Java 9+)
teeing(d1, d2, merger) — two collectors, one pass (Java 12+)
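
A few of these collectors in action, as a sketch over a hypothetical word list (WORDS and the method names are made up for this card):

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class CollectorsSampler {

    static final List<String> WORDS =
            List.of("apple", "avocado", "banana", "blueberry", "cherry");

    // groupingBy with a downstream collector: how many words per first letter.
    static Map<Character, Long> countByInitial() {
        return WORDS.stream()
                .collect(Collectors.groupingBy(w -> w.charAt(0), Collectors.counting()));
    }

    // toMap with a merge function: keep the longest word per first letter.
    static Map<Character, String> longestByInitial() {
        return WORDS.stream()
                .collect(Collectors.toMap(
                        w -> w.charAt(0),
                        w -> w,
                        (a, b) -> a.length() >= b.length() ? a : b));
    }

    // joining with delimiter, prefix, and suffix.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    // summarizingInt: count, sum, min, max, and average in one object.
    static IntSummaryStatistics lengthStats() {
        return WORDS.stream().collect(Collectors.summarizingInt(String::length));
    }
}
```

Without the merge function, longestByInitial would throw IllegalStateException on the second word starting with "a".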

What does Collectors.groupingBy return, and how do you use a downstream collector with it?
?
groupingBy(classifier) returns a Map<K, List<V>> — the key is the classifier result, the value is a list of elements mapping to that key.
groupingBy(classifier, downstream) applies the downstream collector to each group instead of defaulting to toList():

// Count per group
Map<Department, Long> countByDept =
    employees.stream()
             .collect(Collectors.groupingBy(Employee::getDepartment, Collectors.counting()));
 
// Average salary per department
Map<Department, Double> avgByDept =
    employees.stream()
             .collect(Collectors.groupingBy(Employee::getDepartment,
                                            Collectors.averagingDouble(Employee::getSalary)));
 
// Collect only names per group (mapping downstream)
Map<Department, List<String>> namesByDept =
    employees.stream()
             .collect(Collectors.groupingBy(Employee::getDepartment,
                                            Collectors.mapping(Employee::getName, Collectors.toList())));

What is Collectors.teeing() and when was it added?
?
Collectors.teeing(downstream1, downstream2, merger) was added in Java 12. It applies two collectors to the same stream simultaneously in a single pass, then combines their results using a merger function. This avoids iterating a stream twice for two different aggregations.

// Count and sum in one pass
record Stats(long count, double sum) {}
Stats s = numbers.stream()
    .collect(Collectors.teeing(
        Collectors.counting(),
        Collectors.summingDouble(Double::doubleValue),
        Stats::new));

Use it when you need two aggregation results from a stream that can only be traversed once (e.g., from an I/O source), or when splitting into two streams would be expensive. Use sparingly — nested collectors hurt readability.

What are the data sources with excellent splittability for parallel streams? Which are poor?
?
Excellent (O(1) split):

ArrayList, int[], long[], double[] arrays
IntStream.range(), LongStream.range(), IntStream.rangeClosed()
Arrays.stream(array)

Good:

HashSet, HashMap.keySet(), HashMap.values() — splits via internal bucket boundaries
ConcurrentHashMap segments

Poor:

LinkedList — O(n) to find midpoint
Stream.iterate() — inherently sequential
BufferedReader.lines() — sequential I/O
Stream.generate() — unpredictable elements

The ForkJoin framework that backs parallel streams requires efficient splitting — a poor source degrades to near-sequential performance with extra overhead.

What is the rule of thumb for when parallel streams provide a speedup?
?
The rough rule: N × Q > 10,000, where N = number of elements and Q = cost per element in basic operations. For cheap operations (simple filter/map with arithmetic), you need at least 10,000 elements to justify parallelism overhead. For expensive operations (e.g., each call takes ~1ms), even 1,000 elements may justify parallelism.

Other conditions that must ALL be met:

Data source splits efficiently (ArrayList or array)
Operations are stateless and non-interfering
No shared mutable state
Order doesn’t matter (or use unordered() to hint to the framework)

Always benchmark with JMH before concluding a parallel stream is faster. The JVM’s JIT often surprises you.

Why should you not use parallel streams for I/O-bound work?
?
Parallel streams use the common ForkJoin pool — a shared, fixed-size pool (size = number of CPU cores - 1 by default). Blocking I/O inside a parallel stream lambda starves this pool, degrading performance for all tasks that use it (including other parallel streams and CompletableFuture).

For I/O-bound work, use virtual threads (Java 21): Executors.newVirtualThreadPerTaskExecutor() creates one lightweight virtual thread per task, which can block without consuming a platform thread. Millions of virtual threads can block simultaneously with minimal overhead.

Parallel streams = CPU-bound parallelism. Virtual threads = I/O-bound concurrency. These are complementary tools.

What are the terminal operations in the Stream API? How do they differ from intermediate operations?
?
Terminal operations trigger evaluation of the pipeline and produce a result or side effect. They consume the stream — you cannot reuse a stream after a terminal operation.

Common terminal operations:

collect(Collector) — accumulate into a collection/map/value
forEach(Consumer) / forEachOrdered(Consumer) — consume each element
count() — count elements
findFirst() / findAny() — return Optional of first/any match
anyMatch(Predicate) / allMatch / noneMatch — boolean short-circuit
min(Comparator) / max(Comparator) — return Optional of min/max
reduce(identity, BinaryOperator) — fold elements to a single value
toArray() — collect to array
toList() — collect to unmodifiable list (Java 16+)

Intermediate operations are lazy and return a new Stream: filter, map, flatMap, sorted, distinct, limit, skip, peek, mapToInt/Long/Double.
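
The "consumed after a terminal operation" rule is easy to verify. The sketch below (hypothetical names) shows the IllegalStateException on reuse and the usual fix, which is to ask the source for a fresh stream each time.

```java
import java.util.List;
import java.util.stream.Stream;

final class SingleUseStreamSketch {

    // A second terminal operation on the same Stream object fails.
    static boolean secondTerminalOperationFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.count(); // first terminal operation consumes the stream
        try {
            stream.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // The fix: create a new stream from the source for each pipeline.
    static long totalPlusNonEmpty(List<String> source) {
        long total = source.stream().count();
        long nonEmpty = source.stream().filter(s -> !s.isEmpty()).count();
        return total + nonEmpty;
    }
}
```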

What is the difference between findFirst() and findAny() in parallel streams?
?

findFirst(): Returns an Optional with the first element in encounter order. In a parallel stream, this requires coordination across threads to determine which element is “first” — it is slower than findAny().
findAny(): Returns an Optional with any element that matches (no ordering guarantee). In a parallel stream, whichever thread finds a match first wins — no cross-thread coordination needed. It is faster.

For sequential streams, both typically return the first match, although findAny() does not guarantee it. For parallel streams, prefer findAny() unless encounter order is required. Pair with unordered() for best parallel performance:

Optional<String> match = list.parallelStream()
    .unordered()        // hint: we don't care about order
    .filter(s -> s.startsWith("X"))
    .findAny();         // fastest parallel "any match and return it"

What is Predicate.not() and when was it added?
?
Predicate.not(Predicate<T>) (Java 11) returns the logical negation of the given predicate. It’s primarily useful for negating method references:

// Before Java 11: verbose lambda negation
list.stream().filter(s -> !s.isBlank())
 
// Java 11+: clean negation of a method reference
list.stream().filter(Predicate.not(String::isBlank))
 
// Also useful for:
list.stream().filter(Predicate.not(list2::contains)) // elements not in list2

Without Predicate.not(), you had to write s -> !s.isBlank() — a lambda just to negate a method reference, which is less clean. Predicate.not() fills this gap.

What is the Iterable gap with streams, and what are the workarounds?
?
Stream<T> does NOT implement Iterable<T>, so you cannot use a stream directly in a for-each loop — even though Stream has an iterator() method. This is a known design wart.

// Doesn't compile — Stream is not Iterable
for (ProcessHandle ph : ProcessHandle.allProcesses()) { ... } // ERROR
 
// Workaround 1: Cast via method reference (ugly)
for (ProcessHandle ph : (Iterable<ProcessHandle>) ProcessHandle.allProcesses()::iterator) { ... }
 
// Workaround 2: Adapter method
public static <E> Iterable<E> iterableOf(Stream<E> stream) { return stream::iterator; }
for (ProcessHandle ph : iterableOf(ProcessHandle.allProcesses())) { ... }
 
// Best solution: if you control the API, return Collection or List instead of Stream

The Iterable gap still exists in Java 17+. Item 47 recommends returning Collection from APIs to avoid forcing callers to deal with this.

What should APIs return for sequences — Stream, Iterable, or Collection?
?
Return Collection (or a subtype like List or Set) for finite sequences that fit in memory. Collection implements both Iterable (for-each loops) and provides stream() — callers get both APIs with no compromise.

Return Stream only for inherently lazy, computed-on-demand, or potentially infinite sequences (e.g., Stream.iterate() of primes). Document that the return is a stream and that callers must manage it as such.

Avoid returning a bare Iterable when a Collection is feasible: Iterable offers only a subset of Collection's API, and callers who want a stream must write StreamSupport.stream(iterable.spliterator(), false), which is ugly boilerplate.

The guiding question: "Might callers reasonably want both a for-each loop and stream operations?" If yes, return Collection.
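
For reference, this is roughly what callers end up writing when an API hands them only an Iterable. The streamOf helper is a hypothetical adapter, not a JDK method.

```java
import java.util.List;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class IterableToStreamSketch {

    // Hypothetical adapter: turns any Iterable into a sequential Stream.
    static <T> Stream<T> streamOf(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    static int sum(Iterable<Integer> numbers) {
        return streamOf(numbers).mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // A List is used as the Iterable here; with a Collection return type
        // the caller could simply have called numbers.stream().
        Iterable<Integer> numbers = List.of(1, 2, 3);
        System.out.println(sum(numbers)); // prints 6
    }
}
```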

What is Collectors.joining() and how does it work?
?
Collectors.joining() concatenates CharSequence stream elements into a String using an internal StringBuilder (O(n) performance). Three variants:

// No separator
String s1 = words.stream().collect(Collectors.joining());           // "helloworld"
 
// With delimiter
String s2 = words.stream().collect(Collectors.joining(", "));       // "hello, world"
 
// With delimiter, prefix, and suffix
String s3 = words.stream().collect(Collectors.joining(", ", "[", "]")); // "[hello, world]"

Always prefer joining over manual string concatenation with reduce — reduce with string concatenation is O(n²) because each concatenation creates a new String. joining uses a StringBuilder internally and is O(n).

What is Function.compose() vs. Function.andThen()?
?
Both combine two Function objects, but in opposite order:

f.andThen(g): Apply f first, then g. Equivalent to g(f(x)).
f.compose(g): Apply g first, then f. Equivalent to f(g(x)).

Function<String, String> trim    = String::trim;
Function<String, String> toUpper = String::toUpperCase;
 
// andThen: trim first, then uppercase
Function<String, String> normalizeAndUpper = trim.andThen(toUpper);
normalizeAndUpper.apply("  hello  "); // "HELLO"
 
// compose: uppercase first, then trim (unusual but valid)
Function<String, String> upperThenTrim = trim.compose(toUpper);
upperThenTrim.apply("  hello  "); // "  HELLO  " trimmed → "HELLO"

andThen is more natural for left-to-right pipeline reading. Consumer.andThen() works the same way.

What is reduce() in streams and what makes it safe for parallel execution?
?
reduce(identity, BinaryOperator<T>) folds all elements into a single value by repeatedly applying the operator:

// Sum all integers
int sum = numbers.stream().reduce(0, Integer::sum);
// Sequential on [1,2,3,4]: 0+1=1, 1+2=3, 3+3=6, 6+4=10
// Parallel: split [1,2,3,4] → [1,2] and [3,4] → 3 and 7 → 10 (same result)

For reduce to be safe for parallel execution, the operator must be:

Associative: (a op b) op c == a op (b op c) — so any split order gives the same result
Identity-compatible: identity op x == x for all x — so an empty sub-stream contributes correctly

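Integer::sum with identity 0 satisfies both. String concatenation is associative, and "" is a valid identity for it; a non-empty identity would break the rule, because a parallel run may apply the identity once per sub-stream. Never use reduce for mutable accumulation. Use collect instead.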
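
A small sketch of the two rules in practice, with hypothetical class and method names: a parallel reduce with an associative operator and a true identity, and collect as the mutable-accumulation alternative (the three-argument form mirrors the StringBuilder idiom from the Stream.collect documentation).

```java
import java.util.List;
import java.util.stream.IntStream;

final class ReduceSketch {

    // Associative operator + true identity: safe to run in parallel.
    static int parallelSum() {
        return IntStream.rangeClosed(1, 100).parallel().reduce(0, Integer::sum);
    }

    // Subtraction is not associative, so it must not be used with reduce.
    static boolean subtractionIsAssociative(int a, int b, int c) {
        return (a - b) - c == a - (b - c);
    }

    // Mutable accumulation belongs in collect: each sub-stream gets its own
    // StringBuilder, and the partial builders are appended in encounter order.
    static String concat(List<String> parts) {
        return parts.parallelStream()
                .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(parallelSum());                      // prints 5050
        System.out.println(subtractionIsAssociative(5, 3, 1));  // prints false
        System.out.println(concat(List.of("a", "b", "c")));     // prints abc
    }
}
```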

What is IntStream.range() vs. IntStream.rangeClosed()?
?
Both generate a sequential IntStream of integers, but the end boundary differs:

IntStream.range(0, 10) — generates 0, 1, 2, …, 9 (exclusive end, like a for loop i < 10)
IntStream.rangeClosed(0, 10) — generates 0, 1, 2, …, 10 (inclusive end, like a for loop i <= 10)

// Count elements from 1 to 100 inclusive
long count = IntStream.rangeClosed(1, 100).count(); // 100
 
// Array index iteration (exclusive)
IntStream.range(0, array.length).forEach(i -> process(array[i]));

Both have excellent splittability (O(1) arithmetic split) — ideal as parallel stream sources for CPU-bound work.

What is the difference between Stream.of(...) and Arrays.stream(array)?
?

Stream.of(T... values) creates a stream from varargs — it always produces a Stream<T>. For primitive arrays, Stream.of(int[]) produces a Stream<int[]> (one element — the array itself), NOT an IntStream.
Arrays.stream(int[] array) correctly produces an IntStream from a primitive int array, avoiding boxing.

int[] arr = {1, 2, 3};
 
// Wrong: Stream<int[]> — one element (the whole array)
Stream.of(arr).count(); // 1
 
// Correct: IntStream of 3 elements
Arrays.stream(arr).sum(); // 6
 
// String arrays — either works
Stream.of("a", "b", "c")           // Stream<String>, 3 elements
Arrays.stream(new String[]{"a","b"}) // Stream<String>, 2 elements

For primitive arrays, always use Arrays.stream(primitiveArray) to get the primitive stream type and avoid boxing.

What does Collectors.partitioningBy() return, and how is it different from groupingBy()?
?
Collectors.partitioningBy(Predicate<T>) always returns a Map<Boolean, List<T>> with exactly two keys: true and false. It is a specialized form of groupingBy optimized for boolean partitioning.

// Partition into passing and failing students
Map<Boolean, List<Student>> partition =
    students.stream()
            .collect(Collectors.partitioningBy(s -> s.getGrade() >= 60));
 
List<Student> passing = partition.get(true);
List<Student> failing = partition.get(false);
 
// With downstream collector
Map<Boolean, Long> counts =
    students.stream()
            .collect(Collectors.partitioningBy(s -> s.getGrade() >= 60, Collectors.counting()));

Difference from groupingBy: partitioningBy always produces exactly two groups keyed by Boolean, and both keys are present even when one partition is empty; groupingBy produces one group per distinct classifier value and has no entry for values that no element maps to. partitioningBy can also be slightly more efficient because it knows the key set in advance.

What does peek() do in a stream pipeline? When is it appropriate to use?
?
peek(Consumer<T>) is an intermediate operation that applies a consumer to each element as it passes through the pipeline, then passes the element unchanged to the next stage. It is primarily for debugging — inspecting elements at intermediate stages without affecting the pipeline.

List<String> result = words.stream()
    .filter(s -> s.length() > 3)
    .peek(s -> System.out.println("After filter: " + s))  // debug
    .map(String::toUpperCase)
    .peek(s -> System.out.println("After map: " + s))      // debug
    .collect(Collectors.toList());

WARNING: Because streams are lazy, peek only fires when a terminal operation actually pulls elements. Since Java 9, count() may even skip the pipeline entirely when it can compute the size directly from the source, in which case peek never runs. In parallel streams, the order of peek calls is non-deterministic. Use peek for development and debugging only, and remove it before production. Never use peek for side effects that affect correctness (use forEach at the end instead).

What is the difference between stateful and stateless stream operations? Why does it matter for parallelism?
?

Stateless operations: The result for each element depends only on that element, not on any other element or shared state. Examples: filter, map, flatMap. These parallelize perfectly — each element can be processed independently.
Stateful operations: The result depends on other elements or maintains state across elements. Examples: distinct, sorted, limit, skip. These require coordination across elements (and across threads in parallel), potentially requiring buffering or sorting.
For parallel streams:
Stateless operations can scale close to linearly with CPU cores, given a source that splits well and enough work per element
Stateful operations like sorted() may require collecting all elements before processing, negating parallelism benefits
limit(n) and skip(n) with ordered parallel streams are expensive — use unordered() if possible
Side-effecting lambdas that mutate shared state introduce race conditions — the parallel model assumes statelessness

Total Cards: 31
Review Time: ~28 minutes
Priority: HIGH
Last Updated: 2026-05-10

Study Notes by Niladri & AI
