Monitoring Hadoop

By: Aman Singh
Monitoring best practices


So far, we have covered monitoring and metrics collection for Hadoop components, HBase, Hive, and more. It is just as important to decide what should be collected; otherwise, the collected data becomes difficult to manage and it is hard to extract any meaningful information from it.

It is good to enable logging, but at what level? Should we log every event that is generated? Would that help us in any way? These are the questions we need to ask ourselves when designing a monitoring and logging system.

Some of the key points to keep in mind while designing a monitoring and metrics collection system are as follows:

  • How easily it can be scaled

  • How easily we can extract information from the system

  • What we should log and collect

  • How long we should keep the data

We cannot log or collect every metric. For example, suppose we have a 200-node cluster running HBase region servers, and that we collect 20 metrics per region, with 500 regions live at a time, and...
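
To get a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. It uses the three figures quoted above (200 nodes, 20 metrics per region, 500 live regions). Two things are assumptions and do not come from the text: that the 500 regions are counted per region server rather than across the whole cluster, and the hypothetical 10-second collection interval. Change either value to match your own setup.

```python
# Rough estimate of how many metric samples a cluster produces.
# Figures marked "from the text" come from the example above;
# everything marked ASSUMPTION is illustrative only.

NODES = 200               # from the text: 200-node cluster
METRICS_PER_REGION = 20   # from the text: metrics collected per region
REGIONS_LIVE = 500        # from the text; ASSUMPTION: counted per region server
INTERVAL_SECONDS = 10     # ASSUMPTION: hypothetical collection interval

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_interval(nodes: int, regions: int, metrics_per_region: int) -> int:
    """Number of samples produced by one collection pass over the cluster."""
    return nodes * regions * metrics_per_region


def samples_per_day(per_interval: int, interval_seconds: int) -> int:
    """Number of samples produced in 24 hours at a fixed collection interval."""
    return per_interval * (SECONDS_PER_DAY // interval_seconds)


per_interval = samples_per_interval(NODES, REGIONS_LIVE, METRICS_PER_REGION)
per_day = samples_per_day(per_interval, INTERVAL_SECONDS)

print(f"Samples per collection pass: {per_interval:,}")
print(f"Samples per day:             {per_day:,}")
```

Under these assumptions a single collection pass yields 2,000,000 samples. Even if the 500 regions are meant cluster-wide, the volume grows quickly with the collection frequency and the retention period, which is why the questions in the list above matter.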