Cloudera Administration Handbook

By: Rohit Menon

Chapter 4. Exploring HDFS Federation and Its High Availability

You are now ready to set up a Hadoop cluster using CDH5. Once the cluster is up and running, you are responsible for managing it and making sure it is available at all times. In this chapter, we will cover techniques to manage HDFS efficiently and to handle the single point of failure in a Hadoop cluster. We will cover the following topics:

  • Configuring HDFS Federation

  • HDFS high availability using quorum-based storage or shared storage over Network File System (NFS)

  • JobTracker high availability

The heart of HDFS is the namenode. The namenode tracks the locations of all data blocks in the cluster. To serve requests faster, it keeps all of this information in memory. For small clusters, the information stored is lightweight, and a modest amount of RAM is usually enough to hold everything required to maintain the cluster. However, when the number of datanodes increases, hosting...
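
To get a feel for how namenode memory grows with the size of the namespace, here is a minimal back-of-the-envelope sketch. It assumes roughly 150 bytes of memory per namespace object (file, directory, or block), which is a commonly quoted rule of thumb and not a figure taken from this chapter. The function name and its parameters are hypothetical and exist only for illustration; real usage depends on the Hadoop version, path lengths, and replication.

```python
# Rough estimate of namenode memory needed for HDFS metadata.
# ASSUMPTION: ~150 bytes per namespace object (file, directory, or block).
# This is a widely quoted rule of thumb, not a value from this chapter.
BYTES_PER_OBJECT = 150


def estimate_namenode_memory_bytes(num_files, blocks_per_file=1, num_dirs=0):
    """Return an approximate byte count of namenode memory for the namespace.

    Hypothetical helper for illustration only: every file, directory, and
    block is counted as one in-memory object of BYTES_PER_OBJECT bytes.
    """
    num_blocks = num_files * blocks_per_file
    total_objects = num_files + num_dirs + num_blocks
    return total_objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    # Compare a small namespace with a much larger one.
    for files in (1_000_000, 100_000_000):
        gib = estimate_namenode_memory_bytes(files) / 1024**3
        print(f"{files:>12,} files -> ~{gib:.2f} GiB of metadata")
```

Under this assumption the estimate scales linearly with the number of files and blocks, which is why a single namenode can become a memory bottleneck as the cluster grows.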