A Hadoop cluster might have many jobs running on it at any given time, which makes it extremely important to monitor the cluster and make sure it is running well. Hadoop clusters are multi-tenant clusters, which means that multiple users with different use cases and data sizes run jobs on them. How do we make sure that each user or job gets the resources it is configured for on the cluster?
In this chapter, we will look at the checks related to MapReduce and its related components. The following topics will be covered:
MapReduce checks
JobTracker and related health checks
CPU utilization of MapReduce jobs
Memory utilization of MapReduce jobs
YARN component checks
Total cluster capacity in terms of memory and CPU
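As a taste of the capacity checks covered later, the YARN ResourceManager exposes cluster-wide totals over its REST API (the `/ws/v1/cluster/metrics` endpoint). The sketch below, with an illustrative response payload (the field names follow the YARN REST API; the values are made up), computes memory and CPU utilization as a percentage of total cluster capacity:

```python
import json

# Illustrative response shaped like the ResourceManager's
# /ws/v1/cluster/metrics endpoint; values are made up.
sample = json.loads("""
{"clusterMetrics": {
  "allocatedMB": 49152,
  "totalMB": 98304,
  "allocatedVirtualCores": 12,
  "totalVirtualCores": 32,
  "appsRunning": 4
}}
""")

def utilization(metrics):
    """Return memory and CPU utilization as percentages of capacity."""
    m = metrics["clusterMetrics"]
    return {
        "memory_pct": 100.0 * m["allocatedMB"] / m["totalMB"],
        "cpu_pct": 100.0 * m["allocatedVirtualCores"] / m["totalVirtualCores"],
    }

print(utilization(sample))
# → {'memory_pct': 50.0, 'cpu_pct': 37.5}
```

In a live cluster, the same dictionary would come from an HTTP GET against the ResourceManager's web address (port 8088 by default); the computation itself is unchanged.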