Optimizing Hadoop for MapReduce

By: Khaled Tannir

Chapter 4. Identifying Resource Weaknesses

Every Hadoop cluster is built from a different mix of machines and hardware, so each Hadoop installation should be optimized for its own cluster setup. To ensure that your cluster runs jobs efficiently, you need to examine it and identify potential bottlenecks so that you can eliminate them.

This chapter presents scenarios and techniques for identifying cluster weaknesses. We will then introduce formulas that help to determine an optimal configuration for NameNodes and DataNodes. After that, you will learn how to configure your cluster correctly and how to determine the number of mappers and reducers for your cluster.

In this chapter, you will learn the following:

  • How to check for cluster weaknesses using several scenarios

  • How to identify CPU contention and an inappropriate number of mappers and reducers

  • How to identify massive I/O and network traffic

  • How to size your cluster

  • How to configure your cluster correctly