Optimizing Hadoop for MapReduce

By: Khaled Tannir

Using Combiners


You can improve overall MapReduce performance by using Combiners. A Combiner is equivalent to a local Reduce operation that runs on the output of each map task, and it can effectively speed up the subsequent global Reduce operation. Basically, it is used to pre-aggregate the intermediate data and so minimize the number of key/value pairs that are transmitted across the network between mappers and reducers. A Combiner processes the intermediate key/value pairs output by the Map operations, and it does not affect the transformation logic coded in the map and reduce functions.
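To make the saving concrete, here is a minimal sketch in plain Java that simulates a single map task for a word count. It does not use any Hadoop classes; the class name `CombinerSketch` and its `map` and `combine` methods are hypothetical names chosen for this illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of one map task's output; no Hadoop classes are used.
class CombinerSketch {

    // Map step: emit one (word, 1) pair per token, as a word-count mapper would.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Combine step: sum the values per key locally, before anything
    // would be sent across the network to the reducers.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> combined = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = map("to be or not to be");
        Map<String, Integer> combined = combine(mapped);
        System.out.println("pairs without combiner: " + mapped.size());   // 6
        System.out.println("pairs with combiner:    " + combined.size()); // 4
        System.out.println(combined); // {be=2, not=1, or=1, to=2}
    }
}
```

In this toy input the map task would ship four pairs instead of six. On real data with many repeated keys per map task, the reduction in shuffled pairs is typically much larger.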

The standard convention when using Combiners is simply to reuse your reducer function as your Combiner. For this to be safe, the computing logic should be Commutative (the order of the operands has no effect on the final result; for example, a + b = b + a) and Associative (the way in which the operations are grouped has no effect on the final result; for example, (a + b) + c = a + (b + c)).
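The following sketch, again in plain Java with no Hadoop classes, illustrates why these two properties matter. It applies the same function first to each split (as a Combiner would) and then to the partial results (as the reducer would). The class `CombinerSafetySketch` and the method `combineThenReduce` are hypothetical names used for illustration. Summation gives the same answer as a single global pass; averaging does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustration only: reuses one function as both "Combiner" and "reducer".
class CombinerSafetySketch {

    static double sum(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    static double mean(List<Double> values) {
        return sum(values) / values.size();
    }

    // Apply op to every split locally, then apply op again to the partial results.
    static double combineThenReduce(ToDoubleFunction<List<Double>> op,
                                    List<List<Double>> splits) {
        List<Double> partials = new ArrayList<>();
        for (List<Double> split : splits) {
            partials.add(op.applyAsDouble(split));
        }
        return op.applyAsDouble(partials);
    }

    public static void main(String[] args) {
        List<List<Double>> splits = List.of(List.of(1.0, 2.0), List.of(3.0, 4.0, 5.0));
        List<Double> all = List.of(1.0, 2.0, 3.0, 4.0, 5.0);

        // Sum: partial sums 3.0 and 12.0 add up to the global sum.
        System.out.println(sum(all));                                                // 15.0
        System.out.println(combineThenReduce(CombinerSafetySketch::sum, splits));   // 15.0

        // Mean: the mean of the partial means 1.5 and 4.0 is not the global mean.
        System.out.println(mean(all));                                               // 3.0
        System.out.println(combineThenReduce(CombinerSafetySketch::mean, splits));  // 2.75
    }
}
```

A common way around this for averages is to have the Combiner emit a (sum, count) pair per key and let the reducer perform the final division, since both the sum and the count can be combined safely.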

Note

To get more information about Commutative and Associative properties, you can browse...