A major use case of MapReduce is performing simple aggregation operations on data, such as min, max, average, and sum. Both of our previous examples were pretty much exactly that. As we can see from the previous code, the reducers and combiners are fairly generic, with a fair amount of code repetition. To avoid this, Hazelcast provides us with a simpler interface for performing aggregations; it sits on top of MapReduce and is exposed via the distributed collections, hiding much of the internal complexity and handling a few performance optimizations for us.
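To see why such reducers and combiners end up looking so alike, consider the following minimal sketch. It is plain Java and deliberately does not use Hazelcast's API; the class name AggregationSketch, the aggregate helper, and the sample partitions are all hypothetical and exist only for illustration. Each partition is first folded locally (the combiner's role) and the partial results are then merged (the reducer's role), so sum, max, and min differ only in the operator and the starting value that we pass in.

```java
import java.util.List;
import java.util.function.BinaryOperator;

public class AggregationSketch {

    // Hypothetical helper: folds each partition locally (combiner step),
    // then merges the partial results into one value (reducer step).
    // Assumes op is associative and identity is neutral for op.
    static <T> T aggregate(List<List<T>> partitions, T identity, BinaryOperator<T> op) {
        T result = identity;
        for (List<T> partition : partitions) {
            T partial = identity;
            for (T value : partition) {
                partial = op.apply(partial, value);
            }
            result = op.apply(result, partial);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data, split into three "partitions"
        List<List<Integer>> partitions = List.of(
                List.of(3, 9, 4),
                List.of(7, 1),
                List.of(12, 5));

        // The three aggregations share all of their structure
        int sum = aggregate(partitions, 0, Integer::sum);
        int max = aggregate(partitions, Integer.MIN_VALUE, Integer::max);
        int min = aggregate(partitions, Integer.MAX_VALUE, Integer::min);

        System.out.println("sum=" + sum + " max=" + max + " min=" + min);
    }
}
```

Since everything except the operator is boilerplate, it is a natural candidate for a library to provide on our behalf.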
The aggregations interface features two alternative classes to the ones that we previously encountered: a Supplier object to obtain and provide data, and an Aggregation function to perform, well, the aggregation! In both cases, a number of helpers are provided so that we need to write very little code of our own. In the case of the data supplier, if we need to filter or transform the source data before it is...