Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Getting started with Apache HBase


HBase is a highly scalable, distributed NoSQL data store that supports column-oriented data storage. It is modeled after Google's Bigtable. HBase uses HDFS for data storage and allows random, record-level reads and writes, which HDFS alone does not provide.

The HBase table data model can be visualized as a very large, multidimensional sorted map. An HBase table consists of rows, each of which has a unique Row Key followed by a list of columns. Each row can have any number of columns and doesn't have to adhere to a fixed schema. Each data cell (a column in a particular row) can hold multiple values, distinguished by their timestamps, resulting in a three-dimensional table (row, column, timestamp). HBase stores all the rows and columns in sorted order, which makes efficient random access to the data possible.
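The sorted-map view of the data model can be sketched with plain Java collections. The following is only an in-memory illustration, not the HBase client API: the class `SortedMapModel`, its methods, and the sample row keys and column names are hypothetical and chosen for this example. It nests three sorted maps (Row Key, column, timestamp) to show how sparse rows, versioned cells, and range scans over sorted row keys fit together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of an HBase-style table (not the HBase API).
public class SortedMapModel {

    // Row Key -> column -> timestamp -> value, each level kept in sorted order.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Stores one version of a cell; rows and columns are created on demand,
    // so different rows may end up with different columns.
    public void put(String rowKey, String column, long timestamp, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(column, k -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Returns the most recent version of a cell, or null if the cell is absent.
    public String get(String rowKey, String column) {
        return getAsOf(rowKey, column, Long.MAX_VALUE);
    }

    // Returns the newest version written at or before the given timestamp,
    // or null if no such version exists.
    public String getAsOf(String rowKey, String column, long timestamp) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(rowKey);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null) {
            return null;
        }
        java.util.Map.Entry<Long, String> entry = versions.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    // Lists the Row Keys in [startRow, stopRow) in sorted order.
    public List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        SortedMapModel table = new SortedMapModel();
        table.put("row1", "cf:name", 1L, "Alice");
        table.put("row1", "cf:name", 2L, "Alicia");           // second version of the same cell
        table.put("row2", "cf:email", 1L, "bob@example.com"); // row2 has a different column

        System.out.println(table.get("row1", "cf:name"));         // Alicia
        System.out.println(table.getAsOf("row1", "cf:name", 1L)); // Alice
        System.out.println(table.get("row2", "cf:name"));         // null
        System.out.println(table.scan("row1", "row2"));           // [row1]
    }
}
```

Because every level is a sorted map, looking up a cell by Row Key and scanning a contiguous range of Row Keys are both cheap operations in this sketch, which mirrors why sorted storage matters for random access.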

Although the data model has some similarities with the relational data model, unlike relational tables, different rows in the HBase data model may have different columns. For instance, the second row...