Monitoring Hadoop

By: Aman Singh
Hive monitoring


Apache Hive is a data warehousing tool for Hadoop with a SQL-like query language. It provides a query layer on top of Hadoop, which eases the learning curve for traditional DBAs who know SQL and are moving to the Hadoop framework.

In Apache Hive, the query language is referred to as HiveQL. Hive also includes the Metastore, which holds its metadata. The Metastore can be embedded, meaning that it is internal and stored in the default database, Derby, or it can be stored externally in an RDBMS such as MySQL. External storage is considered a best practice, as it lets multiple users connect to Hive. In the embedded mode, only one user can connect to the Hive prompt at a time.
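Before setting up any monitoring, it helps to know which of these two modes a given installation uses. The following is a minimal sketch, not a definitive check. It assumes the Metastore database is configured through the javax.jdo.option.ConnectionURL property in hive-site.xml and that an unset property means the embedded Derby default. The function name metastore_backend and the sample XML, including the host name db.example.com, are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml content pointing the Metastore at MySQL.
SAMPLE_HIVE_SITE = """
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://db.example.com:3306/metastore</value>
  </property>
</configuration>
"""


def read_hive_properties(xml_text):
    """Return the name/value pairs of all <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def metastore_backend(xml_text):
    """Classify the Metastore database as 'embedded-derby' or 'external'.

    Assumption: an unset ConnectionURL falls back to embedded Derby, and a
    'jdbc:derby://' URL is a Derby network server, which counts as external.
    """
    url = read_hive_properties(xml_text).get("javax.jdo.option.ConnectionURL", "")
    if not url:
        return "embedded-derby"
    if url.startswith("jdbc:derby:") and not url.startswith("jdbc:derby://"):
        return "embedded-derby"
    return "external"
```

Calling metastore_backend(SAMPLE_HIVE_SITE) on the sample above classifies it as external. On a real cluster, the text would be read from the hive-site.xml file in the Hive configuration directory.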

It is important to constantly monitor Hive components such as the Metastore, along with the health of the hosts they run on. A few important things need to be tracked in Hive, such as the following:

  • Hive Metastore health checks: Whether the Metastore is local or remote, it is important to monitor its health. Important things to keep track of are as follows...
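
As one illustration of a basic Metastore health check, the sketch below tests whether the Metastore service accepts TCP connections. It assumes a remote Metastore listening on port 9083, the conventional default for its Thrift service. The function name metastore_port_open is hypothetical. An open port only shows that something is listening, so this is a first-level check and not a full health assessment.

```python
import socket


def metastore_port_open(host, port=9083, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: 9083 is the Metastore Thrift port; adjust it if the
    installation uses a different one.
    """
    try:
        # create_connection resolves the host and applies the timeout
        # to the connection attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

A monitoring script could call this function periodically with the Metastore host name and raise an alert when it returns False.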