Hadoop, ZooKeeper, and HBase all produce logs. These logs include information about normal operations, warning and error output, and internal diagnostic data. Ideally, a system would gather and process all these logs to extract useful insight into the state of the cluster. The most basic task is to check these logs and get notified if anything abnormal shows up in them. The NRPE and check_log Nagios plugins can be used to achieve this in a few simple steps.
The NRPE plugin's homepage (http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details) describes it as follows:
NRPE allows you to remotely execute Nagios plugins on other Linux/Unix machines. This allows you to monitor remote machine metrics (disk usage, CPU load, etc.).
Using NRPE, we can remotely execute the check_log Nagios plugin on a cluster node to check the Hadoop/HBase logs generated by that node.
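To make the idea of a log check concrete, the following is a minimal Python sketch of what a check of this kind does on the node: it looks only at the lines appended since the previous run and reports whether any of them match a pattern. It is an illustration, not the check_log plugin's actual implementation. The function name `check_new_log_lines`, the offset state file, and the choice to report matches as CRITICAL are all assumptions made for this example; only the numeric exit codes follow the standard Nagios plugin convention.

```python
import re
from pathlib import Path

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_new_log_lines(log_path, state_path, pattern):
    """Scan lines appended to log_path since the last run for pattern.

    Hypothetical helper for illustration. Returns a (status, message)
    pair in the style of a Nagios plugin result.
    """
    log = Path(log_path)
    state = Path(state_path)
    if not log.is_file():
        return UNKNOWN, f"UNKNOWN - cannot read {log_path}"

    # Byte offset where the previous run stopped; 0 on the first run.
    offset = int(state.read_text()) if state.is_file() else 0
    if log.stat().st_size < offset:
        offset = 0  # the log was rotated or truncated, so start over

    regex = re.compile(pattern)
    with log.open("rb") as f:
        f.seek(offset)
        new_lines = f.read().decode("utf-8", errors="replace").splitlines()
        new_offset = f.tell()

    # Remember how far we read so the next run skips these lines.
    state.write_text(str(new_offset))

    matches = [line for line in new_lines if regex.search(line)]
    if matches:
        # Assumption of this sketch: any match is reported as CRITICAL.
        return CRITICAL, (
            f"CRITICAL - {len(matches)} matching line(s), last: {matches[-1]}"
        )
    return OK, "OK - no matching lines"
```

A real plugin would print the message and exit with the status code, which is how Nagios (through NRPE) learns the result. Tracking the offset means an old error line raises only one alert rather than one on every polling interval.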
The check_log...