Book Image

Learning HBase

By : Shashwat Shriparv
Book Image

Learning HBase

By: Shashwat Shriparv

Overview of this book

Table of Contents (18 chapters)
Learning HBase
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

About the internal storage architecture of HBase


The following figure shows the principle algorithm and data structure HBase works on, that is, LSM-tree, and the way of merging, and precedes the explanation:

HBase stores file using LSM-tree, which maintains data in two separate parts that are optimized for underlying storage. This type of data structure depends on two structures, a current and smaller one in memory and a bigger one on the persistent disk, and once the part in memory becomes bigger than a certain limit, it is merged with the bigger structure that is stored on the disk using a merge sort algorithm and a new in-memory tree is created for newer insert requests. It transforms random data access into sequential data access, which improves read performance, and merging is a background process, which does not affect the foreground processing.