MapReduce is a programming model designed to take advantage of distributed computing. The framework manages the phases a job goes through, such as initialization, submission, execution, and failure recovery. Within a job there are also intermediate stages: map, combiner, shuffle, sort, compression, and reduce. Each stage affects the performance of a job or task, so resource utilization should be monitored at each one. Both Hadoop version 1 and version 2 can be monitored using Nagios. In YARN, there are the ResourceManager, NodeManager, ApplicationMaster, and a few other components, all of which can be monitored using Nagios.
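As a rough illustration of what a Nagios check against YARN might look like, the following sketch reads the ResourceManager's cluster metrics over its REST API and maps them to the standard Nagios plugin exit codes. The thresholds, the function names, and the default URL `http://localhost:8088` are assumptions made for this example, not values taken from any particular cluster.

```python
import json
from urllib.request import urlopen

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(metrics, warn_pending=10, crit_pending=50):
    """Map a clusterMetrics dict to (exit_code, message).

    The pending-application thresholds are hypothetical defaults.
    """
    active = metrics.get("activeNodes", 0)
    bad_nodes = metrics.get("unhealthyNodes", 0) + metrics.get("lostNodes", 0)
    pending = metrics.get("appsPending", 0)

    if active == 0:
        return CRITICAL, "CRITICAL - no active NodeManagers"
    if pending >= crit_pending:
        return CRITICAL, f"CRITICAL - {pending} applications pending"
    if bad_nodes > 0 or pending >= warn_pending:
        return WARNING, (
            f"WARNING - {bad_nodes} unhealthy/lost nodes, "
            f"{pending} applications pending"
        )
    return OK, f"OK - {active} active nodes, {pending} applications pending"


def fetch_metrics(rm_url):
    """Fetch clusterMetrics from the ResourceManager REST API."""
    url = rm_url.rstrip("/") + "/ws/v1/cluster/metrics"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def main(argv):
    # Assumed default: ResourceManager web port on the local host.
    rm_url = argv[1] if len(argv) > 1 else "http://localhost:8088"
    try:
        code, message = evaluate(fetch_metrics(rm_url))
    except Exception as exc:  # network errors, bad JSON, missing keys
        code, message = UNKNOWN, f"UNKNOWN - {exc}"
    print(message)  # Nagios shows the first line of plugin output
    return code
```

To use this as a plugin, a small wrapper script would call `sys.exit(main(sys.argv))` so that Nagios receives the exit code. Keeping `evaluate` separate from the HTTP call makes it possible to test the thresholds without a running cluster.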
Before turning to the Nagios checks themselves, it helps to know a few important commands and logs that give a good picture of the current state of the cluster with respect to MapReduce operations.