YARN (Yet Another Resource Negotiator) is the framework on which the new MapReduce (MRv2) runs. It is designed to scale to large clusters and performs much better than the old framework. The new framework introduces a new set of daemons, and it helps to understand how they communicate with each other. The following diagram shows the daemons and the ports on which they talk:
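As a rough companion to the port layout, the following minimal sketch checks whether the daemon ports are reachable over TCP. The port numbers are the stock defaults from `yarn-default.xml` and are an assumption here, since a cluster may override any of them. The helper names (`is_port_open`, `check_daemons`) and the `localhost` host are hypothetical and used for illustration only.

```python
import socket

# Assumed stock defaults from yarn-default.xml; verify against your own
# yarn-site.xml, because every one of these can be overridden.
RESOURCE_MANAGER_PORTS = {
    "applications manager (client RPC)": 8032,
    "scheduler": 8030,
    "resource tracker": 8031,
    "admin": 8033,
    "web UI": 8088,
}
NODE_MANAGER_PORTS = {
    "localizer": 8040,
    "web UI": 8042,
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_daemons(host, ports):
    """Map each service name in ports to its reachability on host."""
    return {name: is_port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # "localhost" is a placeholder; point this at the real master host.
    for name, ok in check_daemons("localhost", RESOURCE_MANAGER_PORTS).items():
        print(f"ResourceManager {name}: {'reachable' if ok else 'unreachable'}")
```

A successful TCP connection only shows that something is listening on the port. It does not prove that the daemon behind it is healthy, so a check like this is best treated as a first-pass probe.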
In a distributed framework at the scale of Hadoop, many things can go wrong. It is not possible to capture every issue that could occur, but from a monitoring perspective, we can list those that are common and easy to monitor. The following table captures the common issues faced in Hadoop:
Issue | Description and steps that could help |
---|---|
High CPU utilization | This could be due to high query rate or faulty job. Use |