HDInsight Essentials - Second Edition

By: Rajesh Nadipalli

HBase


HBase is an open source NoSQL database built on Hadoop that provides random, real-time read/write access to data in the Data Lake. HBase is modeled after Google's Bigtable, in which data is organized in a column-oriented format. The following are the key features of HBase:

  • Linear scalability: HBase runs on the Hadoop cluster, so it scales the way Hadoop does, by adding nodes

  • Strictly consistent reads and writes: HBase is optimized for read performance and, for writes, seeks to maintain consistency

  • Automatic and configurable sharding: HBase uses row keys to guide data sharding and distribute data throughout the cluster

  • Automatic recovery on failure: When a node fails, HBase automatically recovers by reassigning the regions that the failed region server was handling to other region servers

  • Low-latency queries: HBase provides random, real-time access to data by using memory, Bloom filters, and efficient storage mechanisms
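
To make the sharding feature above more concrete, here is a minimal sketch of how sorted row keys can map to key ranges (regions). It is a toy model in plain Python, not the HBase client API; the split points, region names, and the `region_for` helper are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical split points. Each region covers row keys in the
# half-open range [start_key, next_start_key), compared as strings.
SPLIT_KEYS = ["g", "n", "t"]
# One more region than split points: the first region has no lower
# bound and the last region has no upper bound.
REGIONS = ["region-0", "region-1", "region-2", "region-3"]


def region_for(row_key: str) -> str:
    """Return the (hypothetical) region whose key range contains row_key."""
    # bisect_right counts the split points <= row_key, which is the
    # index of the region that owns the key.
    return REGIONS[bisect_right(SPLIT_KEYS, row_key)]


if __name__ == "__main__":
    for key in ["apple", "grape", "melon", "zebra"]:
        print(key, "->", region_for(key))
```

Because the mapping depends only on key order, rows with adjacent keys land in the same range. Under this model, strictly increasing keys such as raw timestamps would all fall into the last range, which is one reason row key design matters for spreading load across the cluster.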

HBase positioning in Data Lake and use cases

Let's first understand where HBase fits in the overall Data Lake...