You can improve overall MapReduce performance by using Combiners. A Combiner is equivalent to a local Reduce operation and can speed up the subsequent global Reduce operation. It pre-aggregates each mapper's output, which reduces the number of key/value pairs that must be transmitted across the network between mappers and reducers. A Combiner processes the intermediate key/value pairs emitted by the Map operations, and it does not change the transformation logic coded in the map and reduce functions.
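To make the network saving concrete, here is a minimal sketch in plain Python that simulates a word count over two mapper splits. It does not use any real MapReduce framework; the names map_words, sum_by_key, and splits are hypothetical and chosen only for illustration. The same summing function is applied as the local Combiner and as the global Reducer.

```python
from collections import defaultdict

def map_words(line):
    """Map: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]

def sum_by_key(pairs):
    """Sum values per key; used here as both Combiner and Reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical input: each inner list is the split handled by one mapper.
splits = [
    ["a b a", "a c"],
    ["b b a"],
]

# Run the Map step independently on each split.
map_outputs = [
    [pair for line in split for pair in map_words(line)]
    for split in splits
]

# Without a Combiner, every mapped pair is sent to the reducers.
pairs_without_combiner = sum(len(out) for out in map_outputs)

# With a Combiner, each mapper pre-aggregates its own output first.
combined = [list(sum_by_key(out).items()) for out in map_outputs]
pairs_with_combiner = sum(len(out) for out in combined)

# The global Reduce step gives the same totals either way.
result_plain = sum_by_key(p for out in map_outputs for p in out)
result_combined = sum_by_key(p for out in combined for p in out)

print(pairs_without_combiner, pairs_with_combiner)
print(result_plain == result_combined)
```

In this toy input the Combiner cuts the number of pairs sent to the reducers while leaving the final counts unchanged. The saving grows with the number of repeated keys inside each split.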
The standard convention is simply to reuse your reducer function as your Combiner. This is safe only when the computing logic is both commutative (the order of the operands has no effect on the final result, as in a + b = b + a) and associative (the way the operations are grouped has no effect on the final result, as in (a + b) + c = a + (b + c)), because the Combiner is applied to partial subsets of the values before the global Reduce sees them.
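A short Python sketch can show why these two properties matter. It assumes two hypothetical partitions of values for a single key and compares an averaging reducer reused as a Combiner with a variant that carries (sum, count) pairs instead. The function names are illustrative and not taken from any framework.

```python
def mean(values):
    """A reducer that averages its inputs; NOT safe to reuse as a Combiner."""
    return sum(values) / len(values)

# Hypothetical values for one key, held by two different mappers.
partition_a = [1, 2, 3]
partition_b = [10]

# Reference answer: the mean over all values at once.
true_mean = mean(partition_a + partition_b)

# Reusing mean as the Combiner averages the per-partition averages,
# which weights the single value in partition_b too heavily.
naive_mean = mean([mean(partition_a), mean(partition_b)])

def combine(values):
    """Combiner: emit a partial (sum, count) pair, which can be merged by addition."""
    return (sum(values), len(values))

def reduce_partials(partials):
    """Reducer: add up the partial sums and counts, then divide once at the end."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

safe_mean = reduce_partials([combine(partition_a), combine(partition_b)])

print(true_mean, naive_mean, safe_mean)
```

Addition of sums and counts is commutative and associative, so the partial results can be merged in any order and grouping. Averaging is not associative, which is why the naive version drifts from the true mean.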