Monitoring Hadoop

By: Aman Singh
Hive metrics


Apache Hive exposes only very basic metrics, mainly for JVM profiling, but these can still be handy for monitoring and performance analysis.

It makes sense to enable JMX when running the Hive Thrift server, using the following options:

JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8008"

The Thrift server is actually launched through hadoop jar, which passes these options on to the JVM; $HIVE_OPTS must be set in the hive-env.sh file.

The Java package called org.apache.hadoop.hive.common.metrics can be tapped for Hive metrics collection.
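Once JMX is enabled, any JMX client can read the standard JVM MBeans of the process. The following is a minimal sketch rather than code from the book: the class name JvmMetricsProbe and its helper methods are hypothetical, and it reads only the generic java.lang MBeans (heap usage and thread count), not Hive-specific ones. When given a host:port argument, it assumes the JVM was started with the unauthenticated, non-SSL options shown earlier (port 8008 in that snippet); with no argument it inspects its own JVM, which makes it easy to try locally.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper class, used here for illustration only.
class JvmMetricsProbe {

    // Reads the used heap size (bytes) from the standard java.lang:type=Memory MBean.
    static long heapUsedBytes(MBeanServerConnection conn) throws Exception {
        ObjectName memory = new ObjectName(ManagementFactory.MEMORY_MXBEAN_NAME);
        CompositeData data = (CompositeData) conn.getAttribute(memory, "HeapMemoryUsage");
        return MemoryUsage.from(data).getUsed();
    }

    // Reads the live thread count from the standard java.lang:type=Threading MBean.
    static int threadCount(MBeanServerConnection conn) throws Exception {
        ObjectName threading = new ObjectName(ManagementFactory.THREAD_MXBEAN_NAME);
        return (Integer) conn.getAttribute(threading, "ThreadCount");
    }

    static void report(MBeanServerConnection conn) throws Exception {
        System.out.println("heap.used.bytes=" + heapUsedBytes(conn));
        System.out.println("threads.live=" + threadCount(conn));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No target given: inspect this JVM's own platform MBean server.
            report(ManagementFactory.getPlatformMBeanServer());
            return;
        }
        // args[0] is host:port, for example localhost:8008 (assumed to match the JMX port option).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            report(connector.getMBeanServerConnection());
        }
    }
}
```

Running it as java JvmMetricsProbe localhost:8008 on the Thrift server host would print the two values for that JVM, assuming the port is reachable. The same MBeanServerConnection could be used to look up other MBeans registered by the process.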

HBase monitoring

HBase is a NoSQL database designed to work very well on a distributed framework such as Hadoop. It has the concept of master and slave servers (region servers), much like the Hadoop architecture. Because it is a database holding large amounts of data, keeping its state consistent and its performance optimal is important.

Knowing what's happening at a given time...