As HBase runs on top of HDFS, it is important to keep HDFS healthy in addition to taking care of the HBase cluster itself. NameNode is the most important component in an HDFS cluster. It manages the metadata of the cluster, including the filesystem image and the edit log, and a NameNode crash makes the entire HDFS cluster inaccessible.
We need to protect our NameNode metadata against two situations:
Metadata loss in the event of a crash
Metadata corruption for any reason
For the first situation, we can set up NameNode to write its metadata to its local disk and, at the same time, to an NFS mount, so that a copy survives if the NameNode machine is lost. As described in the Setting up multiple, highly available (HA) masters recipe, in Chapter 1, Setting Up HBase Cluster, we can even set up multiple NameNode nodes to achieve high availability.
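The redundancy described above is normally expressed as a comma-separated list of directories in hdfs-site.xml. The following is a minimal sketch, not a definitive tool, that checks whether such a list contains at least two directories. The property name is dfs.name.dir in Hadoop 1.x and dfs.namenode.name.dir in Hadoop 2 and later; the two paths in the sample, and the helper function names, are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# dfs.name.dir is the Hadoop 1.x name; dfs.namenode.name.dir is used in Hadoop 2+.
NAME_DIR_KEYS = ("dfs.namenode.name.dir", "dfs.name.dir")

# Hypothetical hdfs-site.xml content; both paths are placeholders,
# one standing for a local disk and one for an NFS mount.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/var/dfs/name,/mnt/nfs/namenode</value>
  </property>
</configuration>
"""


def namenode_metadata_dirs(hdfs_site_xml):
    """Return the directories configured for NameNode metadata."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() in NAME_DIR_KEYS:
            value = prop.findtext("value", "")
            # The value is a comma-separated list of directories.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


def has_redundant_metadata_dirs(hdfs_site_xml):
    """True if the metadata is written to at least two directories."""
    return len(namenode_metadata_dirs(hdfs_site_xml)) >= 2
```

Calling has_redundant_metadata_dirs(SAMPLE_HDFS_SITE) returns True for the sample above. The check only counts directories; it cannot tell whether the second one is really on a separate device or an NFS mount, so that still has to be verified by hand.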
Our solution for the second situation is to back up the metadata frequently, so that we can restore the NameNode to a known good state if the metadata becomes corrupted.
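As an illustration of what a frequent backup might look like, here is a rough sketch that keeps timestamped backup directories and prunes the oldest ones. It assumes a Hadoop 2 or later installation, where the hdfs dfsadmin -fetchImage command downloads the most recent fsimage from the NameNode; older releases need a different mechanism, and this is not necessarily the method used in this recipe. The function names, the backup root, and the retention count are all hypothetical.

```python
import os
import shutil
import subprocess
import time


def new_backup_dir(backup_root, now=None):
    """Create and return a timestamped directory under backup_root."""
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    path = os.path.join(backup_root, stamp)
    os.makedirs(path)
    return path


def fetch_fsimage(target_dir):
    """Download the latest fsimage into target_dir (assumes Hadoop 2+)."""
    subprocess.run(["hdfs", "dfsadmin", "-fetchImage", target_dir], check=True)


def prune_old_backups(backup_root, keep=7):
    """Delete all but the newest `keep` backups; return the removed names."""
    # Timestamped names sort chronologically, so a plain sort is enough.
    dirs = sorted(
        d for d in os.listdir(backup_root)
        if os.path.isdir(os.path.join(backup_root, d))
    )
    stale = dirs[:-keep] if keep > 0 else dirs
    for name in stale:
        shutil.rmtree(os.path.join(backup_root, name))
    return stale
```

A scheduled job could call new_backup_dir, pass the result to fetch_fsimage, and then call prune_old_backups. Keeping several generations matters here: if corruption goes unnoticed for a while, the most recent backup may already contain it.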
We will describe...