Optimizing Hadoop for MapReduce

By: Khaled Tannir

Summary


In this chapter, we introduced scenarios and techniques that can help you identify your cluster's weaknesses. You learned how to check the health of your Hadoop cluster nodes and how to identify heavy I/O traffic. We also discussed how to identify CPU contention using the Linux vmstat tool.
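
As a reminder of what the vmstat check looks like in practice, here is a minimal Python sketch that parses vmstat text output and flags samples that may indicate CPU contention. The column names (r, wa, and so on) are the standard vmstat headers; everything else is an assumption made for illustration: the helper names parse_vmstat and cpu_contention are hypothetical, the sample figures are made up, and the rules used here (run queue larger than the number of cores, or I/O wait above 20 percent) are common rules of thumb rather than thresholds taken from this chapter.

```python
# Made-up sample in the standard `vmstat` layout (not measured on a real node).
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  10240 402112    0    0     5    12  110  230  5  2 92  1  0
 9  2      0 120044  10240 402112    0    0  2048  4096  900 5200 70 25  0  5  0
"""


def parse_vmstat(text):
    """Parse vmstat text output into a list of dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    # The column header is the row whose first field is the run-queue column "r".
    header_idx = next(i for i, cols in enumerate(lines) if cols[0] == "r")
    header = lines[header_idx]
    rows = []
    for cols in lines[header_idx + 1:]:
        # Keep only numeric data rows with the same width as the header.
        if len(cols) == len(header) and all(c.lstrip("-").isdigit() for c in cols):
            rows.append(dict(zip(header, map(int, cols))))
    return rows


def cpu_contention(rows, cpu_count, wait_threshold=20):
    """Return samples whose run queue exceeds cpu_count or whose I/O wait is high.

    Both conditions are assumed rules of thumb, not values from the book.
    """
    return [r for r in rows if r["r"] > cpu_count or r["wa"] > wait_threshold]


if __name__ == "__main__":
    # Assume a 4-core node for this illustration.
    for sample in cpu_contention(parse_vmstat(SAMPLE), cpu_count=4):
        print("possible contention:", sample)
```

On a real node you would feed the function the output of a command such as vmstat 5 10 and pass the node's actual core count; the thresholds should be tuned to your own workload.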

We then covered the formulas you need to size your Hadoop cluster correctly. In the last section, you learned how to configure the number of mappers and reducers correctly using a new, dedicated formula.

In the next chapter, you will learn more about profiling map and reduce tasks and dive deeper into the universe of Hadoop map and reduce tasks.