One of the most interesting features introduced with Java Streams is the ability to group collections easily using the groupingBy collector.
Although some cases are pretty straightforward, groupingBy can be combined in different ways, and some of those combinations are not obvious to many developers. That’s why I’ve decided it would be worth taking a deep look at them together.
First of all, let’s look at what the Java API offers us in the Collectors class.
Java offers several collectors to group the elements of a Stream. Let’s list them all before we continue:
- groupingBy(classifier)
Groups elements using a function that assigns a key to each element; this key determines which group each element belongs to.
- groupingBy(classifier, downstream)
Same as the previous method, but we can also transform the elements assigned to each key in many ways.
- groupingBy(classifier, mapFactory, downstream)
Same as the previous method, but we can also specify how to create the Map used to store our data.
- groupingByConcurrent(classifier)
- groupingByConcurrent(classifier, downstream)
- groupingByConcurrent(classifier, mapFactory, downstream)
As you can see, there are three different types of collectors and on top of those we have a “concurrent” version for each of them.
There’s nothing better than an example to explain each case clearly, so let’s start!
Group elements in a Map by key
The first method takes each element in our collection and assigns it to a key using a classifier; the collector places these elements in a Map, associating a List of elements with each key.
I know it might sound confusing, so let’s see it with an example. We are going to group our list of employees by sex. Let’s see what it would look like:
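A minimal sketch of this grouping could look like the following. The Employee record, the Sex enum, the sample data and the class name are all hypothetical stand-ins made up for illustration (records need Java 16 or later; a plain class with getters would work the same way).

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupBySexExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    // The classifier (Employee::sex) decides which key each element is stored under.
    static Map<Sex, List<Employee>> bySex(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::sex));
    }

    public static void main(String[] args) {
        bySex(sampleEmployees())
            .forEach((sex, group) -> System.out.println(sex + " -> " + group));
    }
}
```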
This is the simplest case: we just specify a classifier, which is a Function that accepts each element in the collection and returns a key; this key is used to assign each element to a given group.
Please also notice that the key that we use to classify each element doesn’t need to be a Java simple type, it can be any object that we might need. We could even create our own object to classify the elements based on a combination of fields.
In our example there are only two groups, because the “sex” field has only two values: MALE and FEMALE. So as a result what we get back is a Map<Sex, List<Employee>>.
That’s the default structure that we get when we use the groupingBy collector in Java (Map<Key, List<Element>>).
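The same default structure comes back when the key is a custom object combining several fields. As a rough sketch of that idea, assuming a hypothetical SexAndAge record as the key (together with a made-up Employee record and Sex enum), it might look like this:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompositeKeyExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    // Hypothetical composite key; a record provides equals() and hashCode(),
    // which the Map needs in order to recognise equal keys.
    record SexAndAge(Sex sex, int age) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Erin", 28, Sex.FEMALE));
    }

    // The classifier builds a key out of two fields of each employee.
    static Map<SexAndAge, List<Employee>> bySexAndAge(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(e -> new SexAndAge(e.sex(), e.age())));
    }

    public static void main(String[] args) {
        bySexAndAge(sampleEmployees())
            .forEach((key, group) -> System.out.println(key + " -> " + group));
    }
}
```

Whatever type we pick for the key, it should implement equals and hashCode consistently, otherwise equal keys would end up in different groups.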
Let’s see now what happens if we need to modify this default structure.
Grouping by key specifying a different collector
Let’s say that instead of storing our results in a List, as our groupingBy collector does by default, we want to store our elements in a Set; for instance, we might want to avoid duplicates.
In that case a different method is provided which accepts a “downstream” collector that we can use to transform our resulting collection for each group as we see fit. In our example this will be really easy:
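Here is one way this could be written, as a sketch. The Employee record, the Sex enum and the sample data (which deliberately contains a duplicate) are hypothetical and only there to make the example runnable.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class GroupIntoSetExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Alice", 28, Sex.FEMALE), // duplicate on purpose
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE));
    }

    // The downstream collector (toSet) replaces the default List for each group.
    static Map<Sex, Set<Employee>> bySexAsSets(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::sex, Collectors.toSet()));
    }

    public static void main(String[] args) {
        bySexAsSets(sampleEmployees())
            .forEach((sex, group) -> System.out.println(sex + " -> " + group));
    }
}
```

The duplicate only disappears because records compare by value; with a hand-written class we would need to implement equals and hashCode ourselves.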
As you can see, the elements of each group are now stored in a Set instead. What if we want to collect this data differently? We might want to apply some kind of aggregation; for instance, we might need to calculate the average age for each group. We can use the averagingInt collector from the Collectors class to achieve that.
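A small sketch of the averaging case, again built on a hypothetical Employee record and Sex enum with made-up sample data:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AverageAgeExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    // averagingInt reduces each group to a single Double instead of a collection.
    static Map<Sex, Double> averageAgeBySex(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::sex,
                                           Collectors.averagingInt(Employee::age)));
    }

    public static void main(String[] args) {
        averageAgeBySex(sampleEmployees())
            .forEach((sex, average) -> System.out.println(sex + " -> " + average));
    }
}
```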
Please notice that in this case we don’t assign a collection to each key in the map, just a double as a result of calculating the average age for each group.
What else does Java allow us to do? We could also group by a secondary field! Imagine that we want to group by sex first and then by age, as easy as this:
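As a sketch, the two-level grouping can be expressed by passing a second groupingBy as the downstream collector. The Employee record, Sex enum and sample data are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedGroupingExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    // Outer grouping by sex; each group is then grouped again by age.
    static Map<Sex, Map<Integer, List<Employee>>> bySexThenAge(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::sex,
                                           Collectors.groupingBy(Employee::age)));
    }

    public static void main(String[] args) {
        bySexThenAge(sampleEmployees())
            .forEach((sex, byAge) -> System.out.println(sex + " -> " + byAge));
    }
}
```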
There are many combinations we can use with these collectors; the API that Java provides for working with collections is very powerful.
Let’s see one more example: what if we want to find the youngest employee in each group? That’s very easy as well:
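A possible version of this, assuming we group by sex and using a hypothetical Employee record and Sex enum, is sketched here. Note that minBy yields an Optional for each group.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class YoungestPerGroupExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    static Map<Sex, Optional<Employee>> youngestBySex(List<Employee> employees) {
        // The key extractor is a method reference to the age accessor.
        Comparator<Employee> byAge = Comparator.comparing(Employee::age);
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::sex, Collectors.minBy(byAge)));
    }

    public static void main(String[] args) {
        youngestBySex(sampleEmployees())
            .forEach((sex, youngest) -> System.out.println(sex + " -> " + youngest));
    }
}
```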
We’re using minBy collector, which accepts a Comparator; Java provides a very easy way to provide comparators now by using Comparator.comparing(keyExtractor). KeyExtractor is a Function, so passing in a method reference is enough in this case.
That’s brilliant, right? Manipulating data in a functional style with Java Streams has made our lives as developers so much easier!
We could even filter some employees based on a condition after they’re grouped; for example, let’s get all the employees over thirty grouped by sex:
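One way to sketch this is with the filtering collector, which was added to Collectors in Java 9. As in any illustration of this kind, the Employee record, Sex enum and sample data are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilteringExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    // Elements are assigned to a group first; the predicate is applied afterwards.
    static Map<Sex, List<Employee>> overThirtyBySex(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.groupingBy(
                Employee::sex,
                Collectors.filtering(e -> e.age() > 30, Collectors.toList())));
    }

    public static void main(String[] args) {
        overThirtyBySex(sampleEmployees())
            .forEach((sex, group) -> System.out.println(sex + " -> " + group));
    }
}
```

Unlike calling filter() on the stream before collecting, a group whose members are all filtered out still appears in the result, mapped to an empty list.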
We could keep showing examples forever, as there are many possible combinations, but I’d suggest that you use autocomplete on the Collectors class to see all the available collectors whenever you need to do something related to grouping collections.
Let’s take a look now at the third groupingBy option!
Grouping elements specifying what Map to use
The third option allows us to specify how the Map used to store our groups is created. Most of the time we won’t need it, because in most cases the standard Map that Java gives us by default is perfectly valid.
Let’s see an example of how we can use it. Although it will look like a rather silly and useless example, we are going to initialise the Map with an existing Map that already contains a couple of elements:
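A sketch of such an example follows. The Employee record, the Sex enum, the seeded map and its single entry are all hypothetical. It also relies on the way the JDK implementation reuses entries already present in the supplied map, which is observed behaviour rather than something the Javadoc promises, so treat it as an illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MapFactoryExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    // A pre-populated map; its lists must be mutable because the collector adds to them.
    static Map<Sex, List<Employee>> seededMap() {
        Map<Sex, List<Employee>> seed = new HashMap<>();
        seed.put(Sex.MALE, new ArrayList<>(List.of(new Employee("Frank", 22, Sex.MALE))));
        return seed;
    }

    static Map<Sex, List<Employee>> overThirtyInto(
            Map<Sex, List<Employee>> seed, List<Employee> employees) {
        // Downstream collector: keep only employees older than thirty (Java 9+).
        Collector<Employee, ?, List<Employee>> overThirty =
            Collectors.filtering(e -> e.age() > 30, Collectors.toList());
        // The mapFactory supplier hands the existing map to the collector.
        return employees.stream()
            .collect(Collectors.groupingBy(Employee::sex, () -> seed, overThirty));
    }

    public static void main(String[] args) {
        overThirtyInto(seededMap(), sampleEmployees())
            .forEach((sex, group) -> System.out.println(sex + " -> " + group));
    }
}
```

In everyday code the mapFactory argument is more commonly something like TreeMap::new, used to get sorted keys, than an already populated map.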
As you can see, we tell our collector to start with an existing map, which is provided by a supplier we specify in the “mapFactory” argument of our groupingBy method.
What’s interesting about our example is that after running the code you’ll see that our final Map contains employees below thirty! Why is that? That’s because the elements already present in the initial map never go through the filtering collector; only the elements of the stream do. This is something important to consider if you ever need to use the mapFactory argument.
Grouping elements concurrently
After having had a look at the three main methods provided by groupingBy collectors, we know that there’s a concurrent version for each of them.
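These collectors are meant to be used with a parallel stream (they also work on a sequential one, but bring no benefit there) and they return a ConcurrentMap, which is a ConcurrentHashMap by default. They have the CONCURRENT and UNORDERED characteristics set, which means that multiple threads accumulate elements into the same container instead of building separate containers and merging them afterwards; in this case the shared container is the ConcurrentHashMap. Because the collector is unordered, the order of the elements inside each group is not guaranteed.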
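A minimal sketch of the concurrent variant on a parallel stream, using a hypothetical Employee record, Sex enum and sample data:

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ConcurrentGroupingExample {
    // Hypothetical model used only for illustration.
    enum Sex { MALE, FEMALE }
    record Employee(String name, int age, Sex sex) {}

    static List<Employee> sampleEmployees() {
        return List.of(
            new Employee("Alice", 28, Sex.FEMALE),
            new Employee("Bob", 35, Sex.MALE),
            new Employee("Carol", 41, Sex.FEMALE),
            new Employee("Dave", 28, Sex.MALE));
    }

    // All worker threads accumulate into one shared ConcurrentMap.
    static ConcurrentMap<Sex, List<Employee>> bySexConcurrently(List<Employee> employees) {
        return employees.parallelStream()
            .collect(Collectors.groupingByConcurrent(Employee::sex));
    }

    public static void main(String[] args) {
        // The order of the employees inside each list may change between runs.
        bySexConcurrently(sampleEmployees())
            .forEach((sex, group) -> System.out.println(sex + " -> " + group));
    }
}
```

For a list this small the parallel version will not be faster; it only tends to pay off with large inputs.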
So that’s all about groupingBy! I hope you’ve enjoyed it!
If you’d like to improve your understanding of Java Streams and functional programming in Java, I’d recommend that you read “Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions”.
Before JDK 8 was released, grouping elements was nowhere near as easy as it is nowadays; if you consider the amount of code we would need, and the readability we would lose, by writing any of the examples in this article in plain Java 7, the difference is substantial.
It’s also much more enjoyable as a developer to write our solutions using a functional approach, expressing the steps needed rather than how to perform them. This has been a huge step towards good readability and simplicity in our Java code.
I think we can all agree that grouping elements in Java is now very straightforward in most cases, apart from some particularly complex ones.
So that’s it from me! I really hope you’ve learned something today and that you’ll be able to use our learnings in your day-to-day work as a developer. I’m looking forward to seeing you back soon and please follow me if you like my articles and you would like to get notified when my next article gets published!
Thank you very much for reading!