Monitoring Hadoop

By: Aman Singh

Hadoop Ganglia integration


Ganglia is a metrics collection and visualization tool for enterprise environments, and it works well with both Nagios and Hadoop. Basic statistics on CPU, memory, and disk are not enough on their own; finer-grained metrics are also required, and this framework can provide them.

So far, we have seen that metrics can be written to a file or sent to another tool such as Splunk, depending on which class implements the metrics interface. The class that handles the metrics updates is configurable.

For Ganglia, we use GangliaContext, which is an implementation of MetricsContext. Ganglia versions later than 3.0 provide this integration and work well for collecting Hadoop metrics.

With Ganglia, metrics can be collected for the NameNode, JobTracker, MapReduce tasks, JVM, RPC, DataNodes, and the new YARN framework.

Hadoop metrics configuration for Ganglia

First, we need to define a sink class that matches the Ganglia version in use, here version 3.1:

*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
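To make the structure of that property line easier to read, here is a minimal Python sketch that splits it into its parts. It assumes the key follows the `[prefix].[source|sink].[instance].[option]` layout described in the comments of Hadoop's stock hadoop-metrics2.properties file. The helper name `parse_metrics2_key` is hypothetical and is not part of Hadoop or Ganglia.

```python
def parse_metrics2_key(line):
    """Split one metrics2 property line into its named parts.

    Assumes the layout [prefix].[source|sink].[instance].[option]=value.
    """
    # Separate the key from the value first, because the value
    # (a Java class name) also contains dots.
    key, _, value = line.strip().partition("=")
    prefix, kind, instance, option = key.split(".", 3)
    return {
        "prefix": prefix,      # "*" applies the setting to every daemon
        "kind": kind,          # "sink" or "source"
        "instance": instance,  # a name chosen for this sink, here "ganglia"
        "option": option,      # the setting being configured, here "class"
        "value": value,        # the class that handles the metrics updates
    }


line = "*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31"
parsed = parse_metrics2_key(line)
for name, part in parsed.items():
    print(f"{name}: {part}")
```

Read this way, the line says: for all daemons (`*`), the sink instance named `ganglia` is implemented by the class `GangliaSink31`.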

Secondly...