Monitoring Hadoop

By: Aman Singh

MapReduce overview


MapReduce is a programming model designed to take advantage of a distributed framework. The framework takes care of the phases a job goes through, such as initialization, submission, execution, and failure recovery. In addition, there are intermediate stages, such as the map, combiner, shuffle, sort, compression, and reducer stages. Each stage affects the performance of a job or task, so resource utilization should be monitored at each one. Both Hadoop version 1 and version 2 can be monitored using Nagios. In YARN, there are the ResourceManager, NodeManager, ApplicationMaster, and a few other components, all of which can be monitored using Nagios.
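To make the intermediate stages concrete, here is a minimal single-process sketch of a word count that passes through map, combine, shuffle and sort, and reduce. It is only an illustration of the model, not Hadoop's API: the function names are hypothetical, each input line is assumed to stand in for one mapper's split, and compression is left out.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine_phase(pairs):
    """Combiner: pre-aggregate counts locally on a single mapper's output."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def shuffle_and_sort(mapper_outputs):
    """Shuffle and sort: merge all mapper outputs, order by key, group values."""
    merged = sorted(
        (pair for output in mapper_outputs for pair in output),
        key=itemgetter(0),
    )
    return [
        (key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run every stage in order; each line plays the role of one mapper split."""
    mapper_outputs = [combine_phase(map_phase(line)) for line in lines]
    grouped = shuffle_and_sort(mapper_outputs)
    return dict(reduce_phase(key, values) for key, values in grouped)
```

In a real cluster each of these stages runs on different nodes and consumes CPU, memory, disk, and network differently, which is why each one is worth monitoring separately.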

Before moving on to the Nagios checks, it is worth looking at some important commands and logs that give a good idea of the current state of the cluster in terms of MapReduce operations.

Note

Hadoop natively provides commands to check jobs and their related information.
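As a rough sketch of how such a command could feed a Nagios-style check, the Python below runs a job-listing command and maps the outcome to the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper names, the timeout, and the choice of `mapred job -list` (the Hadoop 2 form; Hadoop 1 uses `hadoop job -list`) are assumptions made for illustration, and the sketch looks only at whether the command succeeds, not at its output.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed command; on Hadoop 1 this would be ["hadoop", "job", "-list"].
JOB_LIST_CMD = ["mapred", "job", "-list"]


def check_command(cmd, timeout=30):
    """Run cmd and return a (nagios_status, message) tuple."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except OSError:
        # The binary is missing or could not be started on this host.
        return UNKNOWN, f"UNKNOWN - could not run: {cmd[0]}"
    except subprocess.TimeoutExpired:
        return CRITICAL, f"CRITICAL - no response within {timeout} seconds"
    if result.returncode != 0:
        return CRITICAL, f"CRITICAL - exit code {result.returncode}"
    return OK, "OK - command succeeded"


def main():
    """Hypothetical plugin entry point: print one status line, then exit."""
    status, message = check_command(JOB_LIST_CMD)
    print(message)
    sys.exit(status)
```

Calling `main()` from a script registered as a Nagios command would report CRITICAL when the job listing fails or hangs, and UNKNOWN when the Hadoop client is not installed on the host running the check.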