Every Hadoop node, whether NameNode, DataNode, or ZooKeeper, is a client node of the Nagios server. Each node must have the NRPE plugin installed, with the check scripts placed under /usr/local/nagios/libexec and the commands defined in /usr/local/nagios/etc/nrpe.cfg, as shown here:
command[check_balancer]=/usr/local/nagios/libexec/check_hadoop_namenode.pl -H $HOSTADDRESS$ -u $USER8$ -P $PORT$ -b $ARG2$
command[check_zkp]=/usr/local/nagios/libexec/check_zkpd
Similarly, entries need to be made for each check that is executed on the nodes.
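For context, the Nagios server usually runs these NRPE commands through the check_nrpe plugin. The following is a minimal sketch of the server-side object definitions, not a definitive configuration. The host name hadoop-zk1, the service description, and the generic-service template are assumptions for illustration. Only the check_zkp command name comes from the nrpe.cfg entries above.

```cfg
# Server-side sketch (assumed layout; adapt to your own object files).
# Generic wrapper: $ARG1$ is the command name defined in the node's nrpe.cfg.
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

# Hypothetical service that runs the check_zkp command on one node.
# "hadoop-zk1" and "generic-service" are placeholders, not names from the text.
define service {
    use                     generic-service
    host_name               hadoop-zk1
    service_description     ZooKeeper daemon
    check_command           check_nrpe!check_zkp
}
```

With this layout, each new command[...] entry added to a node's nrpe.cfg needs only a matching service definition on the server, because the check_nrpe wrapper is shared.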
In addition to the aforementioned plugins, checks must be in place for hardware, disk, CPU, and memory. Use the check_procs plugin to check the number of processes running on a system, and the check_tcp plugin to check the open ports.
Make sure that all the nodes have ntp running and that the time is synced by using check_ntp. All of these are provided as the standard Nagios system plugins, and they must be placed on each...