Learning HBase

By: Shashwat Shriparv

HBase housekeeping


As data is written to HBase, it is persisted in immutable files on disk. Each region is divided into stores, one per column family, and each store holds a set of these row-key-ordered files. Because the files are immutable, new writes keep producing additional files rather than updating an existing one in place. As the number of files grows, read I/O slows down, since each read may have to consult many files, and operations lag. To overcome this problem, HBase uses compaction; let's look into it now. Refer to the following figure for a better understanding:
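To make the write path above concrete, here is a minimal, hypothetical sketch (not HBase code): writes buffer in a mutable in-memory store and are flushed as immutable, row-key-sorted files, so a store accumulates many small files over time. The class name and flush threshold are invented for illustration.

```python
FLUSH_THRESHOLD = 3  # hypothetical flush size, just for this sketch


class MiniStore:
    def __init__(self):
        self.memstore = {}  # mutable in-memory buffer (row key -> value)
        self.hfiles = []    # each flush produces one immutable sorted file

    def put(self, row_key, value):
        self.memstore[row_key] = value
        if len(self.memstore) >= FLUSH_THRESHOLD:
            self.flush()

    def flush(self):
        # Flushed files are sorted by row key and never modified again.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, row_key):
        # A read may have to consult the memstore and every flushed file,
        # newest first -- this growing file count is the I/O cost that
        # compaction addresses.
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == row_key:
                    return v
        return None


store = MiniStore()
for i in range(7):
    store.put(f"row{i}", f"v{i}")
print(len(store.hfiles))  # 2 files flushed; one write still buffered
```

Note how seven puts already leave two on-disk files plus buffered data; a real workload produces far more, which is why the files must eventually be merged.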

Compaction

As the name suggests, compaction makes the store files more compact and therefore more efficient to read. Whenever a MemStore fills up, it is flushed to disk as a new HFile, so the number of HFiles keeps growing, and with it the I/O overhead of reads. To minimize this, HBase periodically merges the HFiles into a single HFile. If these files are not merged in time, the system incurs a huge overhead. Compaction is nothing but the merging of two or more...
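The core idea of the merge can be sketched as follows. This is a simplified illustration, not HBase's actual implementation: several row-key-sorted immutable files are merged into one sorted file, and where a key appears in more than one file, the newest entry wins. The `(key, seq, value)` tuple layout and the `compact` function are assumptions made for the sketch, with a higher sequence number standing in for a newer write.

```python
import heapq


def compact(hfiles):
    """Merge row-key-sorted (key, seq, value) files into one sorted file.

    Hypothetical sketch: heapq.merge streams the already-sorted inputs,
    and for duplicate keys the entry with the higher seq (newer write)
    is kept, mimicking how a compacted file retains the latest cell.
    """
    merged = {}
    for key, seq, value in heapq.merge(*hfiles):
        if key not in merged or seq > merged[key][0]:
            merged[key] = (seq, value)
    return [(k, s, v) for k, (s, v) in sorted(merged.items())]


old = [("a", 1, "x"), ("c", 1, "y")]
new = [("a", 2, "x2"), ("b", 2, "z")]
print(compact([old, new]))
# [('a', 2, 'x2'), ('b', 2, 'z'), ('c', 1, 'y')]
```

After the merge, a read needs to consult only the single output file instead of every input file, which is exactly the I/O saving compaction provides.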