Collectors are the aggregation engine of the Stream API. In backend code, they are used for:

grouping records
computing totals and counts
converting lists to maps
building API response structures

groupingBy and Downstream Collectors

Group orders by category:

Map<String, List<Order>> byCategory = orders.stream()
        .collect(Collectors.groupingBy(Order::getCategory));

Revenue by category:

Map<String, BigDecimal> revenueByCategory = orders.stream()
        .filter(o -> o.getStatus() == OrderStatus.COMPLETED)
        .collect(Collectors.groupingBy(
                Order::getCategory,
                Collectors.reducing(BigDecimal.ZERO, Order::getAmount, BigDecimal::add)
        ));

For double-based amounts:

Map<String, Double> revenueByCategory = orders.stream()
        .filter(o -> o.getStatus() == OrderStatus.COMPLETED)
        .collect(Collectors.groupingBy(
                Order::getCategory,
                Collectors.summingDouble(Order::getAmountDouble)
        ));
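
The list at the top mentions counts as well as totals. A minimal runnable sketch of both, assuming a hypothetical Order class that carries only a category and a BigDecimal amount (the article's real Order type is not shown, and the class and method names here are illustrative):

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingSketch {
    // Hypothetical stand-in for the article's Order type.
    static final class Order {
        private final String category;
        private final BigDecimal amount;

        Order(String category, String amount) {
            this.category = category;
            this.amount = new BigDecimal(amount);
        }

        String getCategory() { return category; }
        BigDecimal getAmount() { return amount; }
    }

    // Number of orders per category.
    static Map<String, Long> countByCategory(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(Order::getCategory, Collectors.counting()));
    }

    // Exact revenue per category using BigDecimal addition.
    static Map<String, BigDecimal> revenueByCategory(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        Order::getCategory,
                        Collectors.reducing(BigDecimal.ZERO, Order::getAmount, BigDecimal::add)));
    }

    // Small fixture: two "books" orders and one "toys" order.
    static List<Order> sampleOrders() {
        return Arrays.asList(
                new Order("books", "10.10"),
                new Order("books", "5.20"),
                new Order("toys", "7.00"));
    }

    public static void main(String[] args) {
        System.out.println(countByCategory(sampleOrders()));
        System.out.println(revenueByCategory(sampleOrders()));
    }
}
```

The downstream collector is the only part that changes between the two methods, which is what makes groupingBy easy to repurpose.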

partitioningBy

partitioningBy always produces exactly two buckets, keyed true and false. Both keys are present in the result even when one bucket is empty.

Map<Boolean, List<Order>> fraudBuckets = orders.stream()
        .collect(Collectors.partitioningBy(Order::isFraudulent));

It suits binary splits such as valid/invalid, active/inactive, or paid/unpaid.
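
A small sketch of the two-bucket behavior, using plain integers and a threshold instead of the article's Order type (the class name, method names, and threshold are assumptions for illustration). It also shows that partitioningBy accepts a downstream collector, just like groupingBy:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionSketch {
    // Splits amounts into "at or above threshold" (true) and "below threshold" (false).
    static Map<Boolean, List<Integer>> splitByThreshold(List<Integer> amounts, int threshold) {
        return amounts.stream()
                .collect(Collectors.partitioningBy(a -> a >= threshold));
    }

    // Same split, but each bucket is reduced to a count by a downstream collector.
    static Map<Boolean, Long> countByThreshold(List<Integer> amounts, int threshold) {
        return amounts.stream()
                .collect(Collectors.partitioningBy(a -> a >= threshold, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(splitByThreshold(Arrays.asList(5, 50, 500), 100));
        // Both keys exist even for empty input, so get(true) and get(false) never return null.
        System.out.println(splitByThreshold(Collections.<Integer>emptyList(), 100));
    }
}
```

Because both keys are always present, callers can read either bucket without a null check, which is not true of groupingBy with a boolean classifier.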

toMap: Handle Duplicate Keys Explicitly

A common production bug is forgetting to handle duplicate keys.

Bad (throws IllegalStateException on duplicate key):

Map<String, User> byEmail = users.stream()
        .collect(Collectors.toMap(User::getEmail, Function.identity()));

Good:

Map<String, User> byEmail = users.stream()
        .collect(Collectors.toMap(
                User::getEmail,
                Function.identity(),
                (existing, incoming) -> existing
        ));

Always define a merge strategy when keys can collide.
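
To make the failure mode concrete, here is a runnable sketch with a hypothetical User class (the article's User type is not shown). It contrasts the two-argument toMap with a keep-last merge policy, and uses the four-argument overload with LinkedHashMap so keys stay in first-seen order; the keep-last choice is only one possible policy:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapSketch {
    // Hypothetical stand-in for the article's User type.
    static final class User {
        private final String email;
        private final String name;

        User(String email, String name) {
            this.email = email;
            this.name = name;
        }

        String getEmail() { return email; }
        String getName() { return name; }
    }

    // No merge function: a repeated email fails with IllegalStateException.
    static Map<String, User> strict(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::getEmail, Function.identity()));
    }

    // Keep-last policy; LinkedHashMap preserves the order in which keys first appear.
    static Map<String, User> keepLast(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(
                        User::getEmail,
                        Function.identity(),
                        (existing, incoming) -> incoming,
                        LinkedHashMap::new));
    }

    // Fixture with one duplicated email.
    static List<User> sampleUsers() {
        return Arrays.asList(
                new User("a@example.com", "Ann"),
                new User("b@example.com", "Bob"),
                new User("a@example.com", "Ann (newer)"));
    }

    public static void main(String[] args) {
        try {
            strict(sampleUsers());
        } catch (IllegalStateException e) {
            System.out.println("strict toMap rejected a duplicate key");
        }
        keepLast(sampleUsers())
                .forEach((email, user) -> System.out.println(email + " -> " + user.getName()));
    }
}
```

Whether to keep the first value, keep the last, or combine both is a business decision; the point is that the lambda makes the decision visible in the code.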

Multi-Level Grouping

Revenue by city -> category:

Map<String, Map<String, Double>> revenue = orders.stream()
        .filter(o -> o.getStatus() == OrderStatus.COMPLETED)
        .collect(Collectors.groupingBy(
                Order::getCity,
                Collectors.groupingBy(
                        Order::getCategory,
                        Collectors.summingDouble(Order::getAmountDouble)
                )
        ));

This is where collectors are markedly easier to read than hand-written nested loops.
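
For comparison, the sketch below puts the collector version next to an equivalent hand-written loop and checks that they agree. The Order class is a hypothetical stand-in with only the three accessors the pipeline uses, and the status filter is left out to keep the example short:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MultiLevelSketch {
    // Hypothetical stand-in for the article's Order type.
    static final class Order {
        private final String city;
        private final String category;
        private final double amount;

        Order(String city, String category, double amount) {
            this.city = city;
            this.category = category;
            this.amount = amount;
        }

        String getCity() { return city; }
        String getCategory() { return category; }
        double getAmountDouble() { return amount; }
    }

    // Collector version: city -> category -> summed amount.
    static Map<String, Map<String, Double>> withCollectors(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        Order::getCity,
                        Collectors.groupingBy(
                                Order::getCategory,
                                Collectors.summingDouble(Order::getAmountDouble))));
    }

    // Loop version: the same shape, built by hand with computeIfAbsent and merge.
    static Map<String, Map<String, Double>> withLoop(List<Order> orders) {
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Order o : orders) {
            result.computeIfAbsent(o.getCity(), city -> new HashMap<>())
                    .merge(o.getCategory(), o.getAmountDouble(), Double::sum);
        }
        return result;
    }

    // Fixture with amounts that are exactly representable as doubles.
    static List<Order> sampleOrders() {
        return Arrays.asList(
                new Order("Berlin", "books", 10.0),
                new Order("Berlin", "books", 5.0),
                new Order("Berlin", "toys", 2.5),
                new Order("Paris", "books", 4.0));
    }

    public static void main(String[] args) {
        System.out.println(withCollectors(sampleOrders()).equals(withLoop(sampleOrders())));
    }
}
```

The loop is not long, but the nesting of the result is implicit in it, whereas the nested groupingBy calls mirror the nested map type directly.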

Real API Example: Dashboard Summary DTO

public class SalesSummary {
    private final Map<String, Double> revenueByCategory;
    private final long completedCount;
    private final long fraudCount;

    public SalesSummary(Map<String, Double> revenueByCategory, long completedCount, long fraudCount) {
        this.revenueByCategory = revenueByCategory;
        this.completedCount = completedCount;
        this.fraudCount = fraudCount;
    }
}

Map<String, Double> revenueByCategory = orders.stream()
        .filter(o -> o.getStatus() == OrderStatus.COMPLETED)
        .collect(Collectors.groupingBy(Order::getCategory, Collectors.summingDouble(Order::getAmountDouble)));

long completedCount = orders.stream().filter(o -> o.getStatus() == OrderStatus.COMPLETED).count();
long fraudCount = orders.stream().filter(Order::isFraudulent).count();

SalesSummary dto = new SalesSummary(revenueByCategory, completedCount, fraudCount);

Custom Collector Example (Top N)

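public static Collector<Order, ?, List<Order>> topNByAmount(int n) {
    return Collector.of(
            // Supplier: min-heap by amount, so the smallest retained order sits at the head.
            () -> new PriorityQueue<Order>(Comparator.comparingDouble(Order::getAmountDouble)),
            // Accumulator: add the order, then evict the smallest once more than n are held.
            (pq, order) -> {
                pq.offer(order);
                if (pq.size() > n) pq.poll();
            },
            // Combiner (used by parallel streams): fold the right heap into the left, keeping the cap.
            (left, right) -> {
                right.forEach(o -> {
                    left.offer(o);
                    if (left.size() > n) left.poll();
                });
                return left;
            },
            // Finisher: copy the heap out and sort largest first.
            pq -> {
                List<Order> result = new ArrayList<>(pq);
                result.sort(Comparator.comparingDouble(Order::getAmountDouble).reversed());
                return result;
            }
    );
}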

Use custom collectors only when built-ins cannot express your result shape clearly.
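
A usage sketch for the Top N collector, repeated here in compact form so the example compiles on its own. The Order class with an id field and the topIds helper are assumptions added for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class TopNSketch {
    // Hypothetical stand-in for the article's Order type.
    static final class Order {
        private final String id;
        private final double amount;

        Order(String id, double amount) {
            this.id = id;
            this.amount = amount;
        }

        String getId() { return id; }
        double getAmountDouble() { return amount; }
    }

    // Same approach as the collector above: a min-heap capped at n elements.
    static Collector<Order, ?, List<Order>> topNByAmount(int n) {
        Comparator<Order> byAmount = Comparator.comparingDouble(Order::getAmountDouble);
        return Collector.of(
                () -> new PriorityQueue<Order>(byAmount),
                (pq, order) -> {
                    pq.offer(order);
                    if (pq.size() > n) pq.poll();
                },
                (left, right) -> {
                    for (Order o : right) {
                        left.offer(o);
                        if (left.size() > n) left.poll();
                    }
                    return left;
                },
                pq -> {
                    List<Order> result = new ArrayList<>(pq);
                    result.sort(byAmount.reversed());
                    return result;
                });
    }

    // Convenience helper: ids of the n largest orders, largest first.
    static List<String> topIds(List<Order> orders, int n) {
        return orders.stream()
                .collect(topNByAmount(n))
                .stream()
                .map(Order::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("a", 10.0),
                new Order("b", 50.0),
                new Order("c", 30.0),
                new Order("d", 20.0));
        System.out.println(topIds(orders, 2));
    }
}
```

The heap never holds more than n + 1 elements, so memory stays bounded regardless of input size. The simpler alternative, sorted(...).limit(n), is often clear enough when the input is small.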

collectingAndThen for Final DTO Shaping

collectingAndThen wraps another collector and applies a finishing function to its result, which is useful for post-processing after collection.

Map<String, List<Order>> immutableByCategory = orders.stream()
        .collect(Collectors.collectingAndThen(
                Collectors.groupingBy(Order::getCategory),
                Collections::unmodifiableMap
        ));

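This helps enforce immutability on aggregation results passed to other layers. Note that only the outer map is wrapped: the List values produced by groupingBy remain mutable unless the downstream collector wraps them as well.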
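
Collections.unmodifiableMap protects only the outer map. One way to make the grouped lists read-only too is to apply collectingAndThen at both levels. The sketch below groups strings by length rather than orders by category to stay self-contained; the class and method names are illustrative:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ImmutableGroupingSketch {
    // Groups words by length; the map and every list in it are unmodifiable views.
    static Map<Integer, List<String>> byLength(List<String> words) {
        // Downstream collector: gather each group into a list, then wrap it.
        Collector<String, ?, List<String>> readOnlyList =
                Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList);

        return words.stream()
                .collect(Collectors.collectingAndThen(
                        Collectors.groupingBy(String::length, readOnlyList),
                        Collections::unmodifiableMap));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> grouped = byLength(Arrays.asList("a", "bb", "cc"));
        System.out.println(grouped);
        try {
            grouped.get(2).add("dd");
        } catch (UnsupportedOperationException e) {
            System.out.println("inner list is read-only");
        }
    }
}
```

Naming the downstream collector in a local variable also keeps the nested expression shallow.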

Null and Key Hygiene

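Collectors do not guard against bad keys or values: groupingBy throws NullPointerException when the classifier returns null, and toMap throws NullPointerException on a null value. Before grouping/toMap in production: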

normalize keys (trim, toLowerCase) where needed
filter out null keys/values explicitly
define merge behavior for duplicates

Example:

Map<String, User> byEmail = users.stream()
        .filter(u -> u.getEmail() != null)
        .collect(Collectors.toMap(
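                // Locale.ROOT (java.util.Locale) keeps lowercasing independent of the server's default locale
                u -> u.getEmail().trim().toLowerCase(Locale.ROOT),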
                Function.identity(),
                (a, b) -> a
        ));
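
The hygiene steps above can be sketched end to end on raw email strings. The normalize helper, its treatment of blank input as missing, and the use of Locale.ROOT are assumptions made for this example, not rules from the article:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

class KeyHygieneSketch {
    // Normalizes an email into a map key; returns null for missing or blank input.
    static String normalize(String email) {
        if (email == null) {
            return null;
        }
        String trimmed = email.trim();
        return trimmed.isEmpty() ? null : trimmed.toLowerCase(Locale.ROOT);
    }

    // Counts occurrences per normalized email, skipping entries with no usable key.
    static Map<String, Long> countByEmail(List<String> rawEmails) {
        return rawEmails.stream()
                .map(KeyHygieneSketch::normalize)
                .filter(Objects::nonNull)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Without the null filter, groupingBy rejects a null key with NullPointerException.
    static boolean failsOnNullKey(List<String> rawEmails) {
        try {
            rawEmails.stream().collect(Collectors.groupingBy(KeyHygieneSketch::normalize));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(" A@Example.com", "a@example.com ", null, "   ");
        System.out.println(countByEmail(raw));
        System.out.println(failsOnNullKey(raw));
    }
}
```

Normalizing before grouping means differently cased or padded variants of the same address land in one bucket instead of several.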

Testing Collector Logic

For non-trivial collector pipelines, test:

empty input
duplicate keys
null/invalid records
deterministic totals/counts on known fixture data

Collector bugs are often aggregation-edge bugs, not syntax bugs.
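
A sketch of what such checks might look like without a test framework, so it runs as a plain main method. The pipeline under test, quantityBySku, and its "sku:quantity" line format are invented for this example; in a real project the same cases would typically live in JUnit tests:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class CollectorChecks {
    // Pipeline under test: total quantity per SKU from "sku:quantity" lines.
    // Null lines and lines without exactly two parts are skipped.
    static Map<String, Integer> quantityBySku(List<String> lines) {
        return lines.stream()
                .filter(Objects::nonNull)
                .map(line -> line.split(":"))
                .filter(parts -> parts.length == 2)
                .collect(Collectors.toMap(
                        parts -> parts[0],
                        parts -> Integer.parseInt(parts[1]),
                        Integer::sum));
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // One check per case from the list above.
    static void runChecks() {
        // Empty input yields an empty map, not null.
        check(quantityBySku(Collections.<String>emptyList()).isEmpty(), "empty input");

        // Duplicate keys are merged by summing.
        check(quantityBySku(Arrays.asList("a:2", "a:5")).get("a") == 7, "duplicate keys");

        // Null and malformed records are ignored rather than failing the pipeline.
        Map<String, Integer> messy = quantityBySku(Arrays.asList("a:2", null, "broken", "b:3"));
        check(messy.size() == 2, "invalid records skipped");

        // Known fixture, known totals.
        Map<String, Integer> totals = quantityBySku(Arrays.asList("a:2", "b:3", "a:5"));
        check(totals.get("a") == 7 && totals.get("b") == 3, "deterministic totals");
    }

    public static void main(String[] args) {
        runChecks();
        System.out.println("all collector checks passed");
    }
}
```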

Performance and Readability Rules

use built-in collectors first
avoid very deep nested collector trees in one expression
extract complex downstream collectors to helper methods
for money, prefer BigDecimal
benchmark before parallel collection
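
As an illustration of the second and third rules, a nested grouping can be split into small named collector factories. This is a sketch with a hypothetical Order class and BigDecimal amounts; the helper names totalAmount and totalByCategory are invented here:

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class DownstreamHelpers {
    // Hypothetical stand-in for the article's Order type.
    static final class Order {
        private final String city;
        private final String category;
        private final BigDecimal amount;

        Order(String city, String category, String amount) {
            this.city = city;
            this.category = category;
            this.amount = new BigDecimal(amount);
        }

        String getCity() { return city; }
        String getCategory() { return category; }
        BigDecimal getAmount() { return amount; }
    }

    // Named downstream collector: exact sum of order amounts.
    static Collector<Order, ?, BigDecimal> totalAmount() {
        return Collectors.reducing(BigDecimal.ZERO, Order::getAmount, BigDecimal::add);
    }

    // Named downstream collector: totals per category.
    static Collector<Order, ?, Map<String, BigDecimal>> totalByCategory() {
        return Collectors.groupingBy(Order::getCategory, totalAmount());
    }

    // The top-level pipeline now reads as one short line per grouping level.
    static Map<String, Map<String, BigDecimal>> totalByCityAndCategory(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(Order::getCity, totalByCategory()));
    }

    // Fixture: two Berlin book orders and one Paris toy order.
    static List<Order> sampleOrders() {
        return Arrays.asList(
                new Order("Berlin", "books", "10.10"),
                new Order("Berlin", "books", "5.20"),
                new Order("Paris", "toys", "7.00"));
    }

    public static void main(String[] args) {
        System.out.println(totalByCityAndCategory(sampleOrders()));
    }
}
```

Each helper can be tested and reused on its own, and the return types document the shape of every level.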
