NodeManager is a per-node daemon that runs on every slave node of the cluster. These nodes are the workers that execute applications. To schedule efficiently, the ResourceManager must monitor the health of these nodes. Health indicators may include memory, CPU, network usage, and so on. The ResourceManager daemon will not schedule any new application execution requests on an unhealthy NodeManager.
YARN defines a mechanism to monitor the health of a node using a script. An administrator needs to define a shell script that monitors the node. If the script prints ERROR as the first word in any of its output lines, the ResourceManager marks the node as UNHEALTHY.
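The output rule described above can be sketched as a small shell helper. This is only an illustration of the convention, not the code YARN itself runs: the function name classify_health_output and the sample output lines are hypothetical, and the sketch assumes that ERROR must appear at the very start of a line with no leading whitespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emulates the "ERROR as the first word" rule.
# Illustration only; this is not YARN's own implementation.

# Reads health-script output on stdin and prints UNHEALTHY if any line
# starts with the word ERROR, otherwise prints HEALTHY.
classify_health_output() {
  if grep -Eq '^ERROR([[:space:]]|$)'; then
    echo "UNHEALTHY"
  else
    echo "HEALTHY"
  fi
}

# Made-up outputs from two runs of a health script.
printf 'memory usage 42%%\nall checks passed\n' | classify_health_output
printf 'disk ok\nERROR memory usage above threshold\n' | classify_health_output
```

Run as written, the first call prints HEALTHY and the second prints UNHEALTHY, because only the second output contains a line whose first word is ERROR.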
A sample script to check the memory usage of a node follows. It checks the current memory usage and prints an error message if the usage is greater than 95%. You need to create a shell script such as check_memory_usage.sh and change...