Optimizing Hadoop for MapReduce

By: Khaled Tannir

Tuning map and reduce parameters


Picking the right number of tasks for a job can have a huge impact on Hadoop's performance. In Chapter 4, Identifying Resource Weaknesses, you learned how to configure the number of mappers and reducers correctly. But sizing the number of mappers and reducers correctly is not enough to get the maximum performance from a MapReduce job. The optimum occurs when every machine in the cluster has something to do at any given time while a job is executing. Remember that the Hadoop framework has more than 180 parameters, and most of them should not keep their default settings.
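To make the idea of keeping every machine busy concrete, the following minimal Python sketch estimates how many scheduling "waves" a set of tasks needs and how full the last wave is. It is only an illustration under simplifying assumptions: a slot-based cluster in which every node offers the same number of task slots and every task takes roughly the same time. The function name `task_waves` and the example figures are hypothetical and do not come from Hadoop itself.

```python
import math


def task_waves(num_tasks: int, nodes: int, slots_per_node: int) -> dict:
    """Estimate scheduling waves for a job on a slot-based cluster.

    Assumes identical nodes and tasks of roughly equal duration
    (a simplification; real jobs rarely behave this evenly).
    """
    if num_tasks < 0 or nodes <= 0 or slots_per_node <= 0:
        raise ValueError("num_tasks must be >= 0; nodes and slots_per_node must be > 0")

    # Total number of tasks the cluster can run at the same time.
    total_slots = nodes * slots_per_node

    # Number of rounds needed to run all tasks.
    waves = math.ceil(num_tasks / total_slots)

    # Tasks left for the final round, and the share of slots they occupy.
    last_wave_tasks = num_tasks - (waves - 1) * total_slots if waves else 0
    last_wave_utilization = last_wave_tasks / total_slots if waves else 0.0

    return {
        "total_slots": total_slots,
        "waves": waves,
        "last_wave_tasks": last_wave_tasks,
        "last_wave_utilization": last_wave_utilization,
    }


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes with 4 task slots each, 100 tasks.
    print(task_waves(num_tasks=100, nodes=10, slots_per_node=4))
```

In this hypothetical case the last wave fills only half of the available slots, so half of the cluster sits idle while the job finishes. A task count that is a multiple of the total slot count (for example, 120 tasks here) would keep every slot occupied in every wave.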

In this section, we will present other techniques to calculate the number of mappers and reducers. It may be more productive to try more than one optimization method, because the aim is to find a configuration for a given job that uses all the available resources on your cluster. The goal of this change is to enable the user to run as many mappers and reducers in parallel as possible to fully...