HBase Design Patterns

By: Mark Kerzner, Sujee Maniyam

Overview of this book

With the increasing use of NoSQL in general and HBase in particular, knowing how to build practical applications depends on the application of design patterns. These patterns, distilled from extensive practical experience on multiple demanding projects, guarantee the correctness and scalability of the HBase application. They are also generally applicable to most NoSQL databases.

Starting with the basics, this book will show you how to install HBase in different node settings. You will then be introduced to key generation and management and the storage of large files in HBase. Moving on, this book will delve into the principles of using time-based data in HBase, and show you some examples of data denormalization while working with HBase. Finally, you will learn how to translate the familiar SQL design practices into the NoSQL world. With this concise guide, you will get a better idea of typical storage patterns, application design templates, HBase explorer in multiple scenarios with minimum effort, and reading data from multiple region servers.
Table of Contents (15 chapters)
HBase Design Patterns
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

About the Reviewers

Ricky Ho is a data scientist and programmer, providing advisory and development services in big data analytics, machine learning, and distributed system design projects. He has a wide range of technical interests but is especially passionate about the intersection of machine learning and big data. He has served as the Principal Architect of Microsoft advertising, implementing scalable prediction systems to optimize advertising revenue within a large, web-scale deployment. Prior to this, he was a researcher at Adobe's lab, where he processed web log data for research related to predictive analytics. Before that, he was a distinguished architect in PayPal's risk management team, where he developed a fraud detection system using machine learning and anomaly detection algorithms. Ricky holds 10 patents in distributed computing and cloud resource optimization. He is also an active technical blogger and shares what he learns on his blog at http://horicky.blogspot.com.

Raghu Sakleshpur is a technologist at heart who works in the field of big data, developing and designing solutions specifically in the Hadoop ecosystem. He started off his career in distributed (clustered) systems and transitioned to the enterprise Java application (middleware) space, only to return to his true passion of handling big data in both scaled-up and scaled-out architectures. He is currently working with Intel in the field of big data and spends a good portion of his time working with customers and partners alike to define optimal architectures for specific big data needs.

Sergey Tatarenko is a senior software developer in a major legal e-discovery company in Austin, TX. He received his MSc in Computer Science from Ben-Gurion University of the Negev in Israel and has worked as a software developer since 1999. He started his professional career at Clockwork Solutions, Israel, and worked on a product that was used to build discrete event simulation models. Later, he led a team of software developers at HyperRoll, but staying farther away from actual software development was not so much fun. In 2008, Sergey agreed to relocate to the USA and help his previous employer finish building their product. In April 2013, he decided to get more exposure to big data and started working for a leading legal e-discovery company in Austin, TX. In addition to being a software developer, Sergey is a proud father of three beautiful kids (Ilia, Antony, and Emilia) and a happy husband to his beautiful wife, Ilona. He is also a very active member of the Russian-speaking community of Austin, an enthusiastic builder of Arduino projects at home, and an occasional fisherman.