Count and Remove Duplicates from a Java Stream

Examples of finding, counting and removing duplicate elements from a Java Stream.

Overview

A Java Stream is a lazily processed sequence of elements that supports sequential and parallel operations through a Stream pipeline. A Stream won’t process any elements from its source until a terminal operation of the pipeline runs.
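The laziness described above can be observed with a small sketch. The class name LazyStreamDemo and the log list are illustrative assumptions, not part of the tutorial's code; the example only records when each step runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Builds a pipeline, logs a marker, then runs the terminal operation.
    // The returned log shows that no element is touched before forEach().
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Stream<String> pipeline = Stream.of("a", "b", "a")
                .peek(e -> log.add("saw " + e)) // intermediate operation: not executed yet
                .distinct();                    // intermediate operation: not executed yet

        log.add("pipeline built");

        // Terminal operation: only now do elements flow through the pipeline.
        pipeline.forEach(e -> log.add("got " + e));
        return log;
    }

    public static void main(String[] args) {
        // "pipeline built" is the first entry, before any "saw" or "got" entry.
        run().forEach(System.out::println);
    }
}
```

Because "pipeline built" is logged before any element is seen, the sketch shows that peek() and distinct() only describe the work; the terminal forEach() triggers it.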

This tutorial provides quick examples of finding, counting and removing duplicate elements from a Stream of built-in Java types or custom objects.

Remove Stream Duplicates using distinct()

The Java Stream interface provides several intermediate operations to process and filter elements in a Java Stream. The distinct() method deduplicates the Stream’s elements and returns a new Stream of the unique elements. For ordered Streams, it keeps the first occurrence of each element.

Example of using distinct() to remove Stream duplicates

Stream<String> stream = Stream.of("a", "b", "c", "b", "d", "a", "d");
Stream<String> output = stream.distinct();
output.forEach(System.out::print);

// prints:
// abcd

Remove Stream Duplicates using Set

Alternatively, we can use a Java Set to remove duplicates from a Stream. Because a Set holds only unique elements, we can collect our Stream into a Set and create a new Stream from it with all duplicates removed.

Example of using a Java Set to remove duplicate elements from a Stream.

Stream<String> stream = Stream.of("a", "b", "c", "b", "d", "a", "d");
Stream<String> output = stream
    .collect(Collectors.toSet())
    .stream();
output.forEach(System.out::print);

// prints (the iteration order of the Set is not guaranteed):
// abcd

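Please note that Collectors.toSet() makes no guarantee about the type of Set it returns. In practice it is a HashSet, which is an unordered collection, so the resulting Stream may not preserve the original order of the elements.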
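If the original order matters, one option is to collect into a LinkedHashSet, which iterates in insertion order. The following is a minimal sketch of that idea; the class and method names (OrderedSetDedup, dedupe) are illustrative assumptions.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OrderedSetDedup {

    // Collects into a LinkedHashSet so the first occurrence of each element
    // keeps its position, then turns the Set back into a List.
    static List<String> dedupe(Stream<String> stream) {
        return stream
                .collect(Collectors.toCollection(LinkedHashSet::new))
                .stream()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dedupe(Stream.of("d", "b", "c", "b", "a", "d")));
        // prints: [d, b, c, a]
    }
}
```

Collectors.toCollection() lets us pick the concrete collection, whereas Collectors.toSet() leaves that choice to the library.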

Remove Duplicates from a Stream of Custom Objects

The distinct() method uses the equals() method to check if two elements are equal, and in practice it also relies on hashCode(). To remove duplicates from a Stream of custom objects, our custom class must therefore provide the equality logic by overriding both methods consistently.

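// Assumption: the Lombok annotations below generate the constructor, getters
// and toString() that the later examples rely on.
@Getter
@ToString
@RequiredArgsConstructor
public class Student {
    private final Long studentId;
    private final String firstName;
    private final String lastName;
    private final Integer age;

    // Two students are equal when their studentId values match.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Student student2)) {
            return false;
        }
        return student2.studentId.equals(this.studentId);
    }

    // Must stay consistent with equals(): same studentId, same hash code.
    @Override
    public int hashCode() {
        return studentId.hashCode();
    }
}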

The equals() method in our custom class uses the studentId field to decide if two class instances are equal. Now, we can use the distinct() method on a Stream of the Student objects.

Stream<Student> stream = Stream.of(
    new Student(1L, "Bob", "Jack", 12),
    new Student(2L, "Nick", "Stephen", 14),
    new Student(3L, "Bob", "Holden", 14),
    new Student(2L, "Nick", "Stephen", 14)
);

Stream<Student> output = stream.distinct();
output.forEach(System.out::println);

// prints:
// Student(studentId=1, firstName=Bob, lastName=Jack, age=12)
// Student(studentId=2, firstName=Nick, lastName=Stephen, age=14)
// Student(studentId=3, firstName=Bob, lastName=Holden, age=14)
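To see why the equals() and hashCode() overrides matter, the sketch below compares two simplified stand-in classes. PlainStudent and KeyedStudent are hypothetical classes invented for this illustration; they are not the tutorial's Student class.

```java
import java.util.stream.Stream;

class EqualityDemo {

    // Hypothetical class without equals()/hashCode(): it inherits identity
    // comparison from Object, so every instance is unique.
    static class PlainStudent {
        final long studentId;

        PlainStudent(long studentId) {
            this.studentId = studentId;
        }
    }

    // Hypothetical class that defines equality by studentId.
    static class KeyedStudent {
        final long studentId;

        KeyedStudent(long studentId) {
            this.studentId = studentId;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof KeyedStudent k && k.studentId == studentId;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(studentId);
        }
    }

    // distinct() cannot merge the two students with id 2: they are different objects.
    static long plainCount() {
        return Stream.of(new PlainStudent(1), new PlainStudent(2), new PlainStudent(2))
                .distinct()
                .count();
    }

    // With equals()/hashCode() in place, the two students with id 2 collapse into one.
    static long keyedCount() {
        return Stream.of(new KeyedStudent(1), new KeyedStudent(2), new KeyedStudent(2))
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(plainCount()); // prints: 3
        System.out.println(keyedCount()); // prints: 2
    }
}
```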

Using Stream distinct() by a Particular Field

Sometimes, we cannot modify the equals() method in our custom class, or we want to use a different comparison logic than the one provided by the equals() method.

We can create a wrapper class around our custom object for such cases. The wrapper class will provide our custom comparison logic in the form of its equals() and hashCode() implementations.

Example of using a wrapper class to remove duplicates from a Java Stream based on a specific field or two.

@Getter
@RequiredArgsConstructor
class StudentWrapper {
    private final Student student;

    // Two wrappers are equal when the wrapped students share a first name.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof StudentWrapper wrapper2)) {
            return false;
        }
        return wrapper2.student.getFirstName()
            .equals(this.student.getFirstName());
    }

    @Override
    public int hashCode() {
        return student.getFirstName().hashCode();
    }
}

Now, we can map the Stream of our custom objects into a Stream of the wrapper class, call distinct() on it, and then unwrap the remaining elements.

Stream<Student> stream = Stream.of(
    new Student(1L, "Bob", "Jack", 12),
    new Student(2L, "Nick", "Stephen", 14),
    new Student(3L, "Bob", "Holden", 14),
    new Student(2L, "Nick", "Stephen", 14)
);

Stream<Student> output = stream
    .map(StudentWrapper::new)
    .distinct()
    .map(StudentWrapper::getStudent);
output.forEach(System.out::println);

// prints:
// Student(studentId=1, firstName=Bob, lastName=Jack, age=12)
// Student(studentId=2, firstName=Nick, lastName=Stephen, age=14)
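As a possible alternative to the wrapper class, a stateful filter can track the keys it has already seen. This is a commonly used idiom rather than the approach of this tutorial. The helper distinctByKey and the simplified Student record below are assumptions made for this sketch. It is intended for sequential Streams with non-null keys; in a parallel Stream, which of the duplicate elements survives is not deterministic.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DistinctByKeyDemo {

    // Simplified stand-in for the tutorial's Student class (hypothetical record).
    record Student(long studentId, String firstName, String lastName, int age) { }

    // Hypothetical helper: returns a stateful predicate that accepts an element
    // only the first time its key is seen. Set.add() returns false for repeats.
    static <T, K> Predicate<T> distinctByKey(Function<? super T, ? extends K> keyExtractor) {
        Set<K> seen = ConcurrentHashMap.newKeySet();
        return element -> seen.add(keyExtractor.apply(element));
    }

    // Keeps the first student encountered for each first name.
    static List<Student> uniqueByFirstName(Stream<Student> students) {
        return students
                .filter(distinctByKey(Student::firstName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<Student> stream = Stream.of(
                new Student(1L, "Bob", "Jack", 12),
                new Student(2L, "Nick", "Stephen", 14),
                new Student(3L, "Bob", "Holden", 14),
                new Student(2L, "Nick", "Stephen", 14));

        // Keeps the students with studentId 1 and 2, as in the wrapper example.
        uniqueByFirstName(stream).forEach(System.out::println);
    }
}
```

The filter avoids the extra wrapper type, at the cost of holding mutable state inside the pipeline, so a fresh predicate is needed for every Stream.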

Count Duplicates in a Stream

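We have seen how we can remove duplicates from a Stream using the distinct() method. However, sometimes we may wish to count how often each element occurs. To do that, we can use the toMap() collector: every element becomes a key with an initial count of 1, and the counts of equal keys are summed.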

Example of counting the duplicates in a Stream

Stream<Integer> stream = Stream.of(22, 31, 22, 34, 25, 31, 34);

// Key: the element itself; value: 1 per occurrence, merged with Long::sum.
Map<Integer, Long> map = stream
    .collect(Collectors.toMap(Function.identity(), x -> 1L, Long::sum));

map.entrySet().forEach(System.out::println);

// prints:
// 34=2
// 22=2
// 25=1
// 31=2
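Once the occurrences are counted, finding the duplicated elements is a matter of keeping the keys whose count is greater than one. The sketch below uses the groupingBy() and counting() collectors as an equivalent way to build the counts; the class and method names are illustrative assumptions.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FindDuplicatesDemo {

    // Groups equal elements together and counts the size of each group.
    static Map<Integer, Long> countOccurrences(Stream<Integer> stream) {
        return stream.collect(
                Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Returns only the elements that occur more than once.
    static Set<Integer> findDuplicates(Stream<Integer> stream) {
        return countOccurrences(stream).entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // The resulting Set contains 22, 31 and 34; 25 occurs only once.
        System.out.println(findDuplicates(Stream.of(22, 31, 22, 34, 25, 31, 34)));
    }
}
```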

Summary

We learned how to use Java Stream’s distinct() method in different scenarios to remove duplicate elements from a Stream. The distinct() method compares elements using their equals() method and returns a new Stream containing only the unique elements.

We also learned that a custom class must provide the equality logic through its equals() and hashCode() methods before we can deduplicate a Stream of its objects. If we want to remove duplicates using specific fields not covered by the equals() method, we can use the wrapper class workaround. Lastly, we learned how to count duplicate elements in a Stream using the toMap() collector.

You can refer to our GitHub Repository for the complete source code of the examples used in this tutorial.