HBase Design Patterns

By: Mark Kerzner, Sujee Maniyam

Overview of this book

With the increasing use of NoSQL in general and HBase in particular, building practical applications depends on knowing how to apply design patterns. These patterns, distilled from extensive practical experience on multiple demanding projects, help ensure the correctness and scalability of an HBase application. They are also generally applicable to most NoSQL databases.

Starting with the basics, this book will show you how to install HBase in different node settings. You will then be introduced to key generation and management and the storage of large files in HBase. Moving on, this book will delve into the principles of using time-based data in HBase, and show you some denormalization use cases for working with HBase. Finally, you will learn how to translate the familiar SQL design practices into the NoSQL world. With this concise guide, you will get a better idea of typical storage patterns and application design templates, and learn how to explore HBase in multiple scenarios with minimum effort and read data from multiple region servers.

Preface

Software plays a paramount role in today's world, and NoSQL databases are an important part of the modern stack. They are found wherever subsecond access to vast amounts of information is needed. However, there is a huge gap between the first "Hello World" example in a NoSQL database and creating practical, scalable, and stable applications. The aim of this book is to fill this gap and to give you practical guidelines for building NoSQL software.

The book is formulated specifically in terms of HBase, and there are a few areas of design where HBase differs from other NoSQL databases such as Cassandra or MongoDB, but most of the design patterns discussed here can be transferred to them. You are expected to invest effort in learning, and that effort will pay off in rewarding skills.

What this book covers

Chapter 1, Starting Out with HBase, covers what HBase is and the various ways in which you can install it on your computer or cluster of computers, with practical advice on the development environment.

Chapter 2, Reading, Writing, and Using SQL, covers the HBase shell and gives the first example of Java code to read and write data in HBase. It also covers using the Phoenix driver for higher-level access, which gives back SQL, justifying the "Not-only-SQL" meaning of NoSQL.

Chapter 3, Using HBase Tables for Single Entities, covers the simplest HBase tables, which deal with single entities, such as a table of users. The design patterns in this chapter emphasize scalability, performance, and planning for special cases, such as restoring forgotten passwords.

Chapter 4, Dealing with Large Files, covers how to store large files in HBase systems. It also covers the alternative ways of storing them and the best practices extracted from solutions for large environments, such as Facebook, Amazon, and Twitter.

Chapter 5, Time Series Data, shows that stock market, human health monitoring, and system monitoring data are all classified as time series data. The design patterns for such data organize time-based measurements in groups, resulting in balanced, high-performing HBase tables. Many of the lessons are drawn from OpenTSDB.
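As a rough illustration of what organizing time-based measurements in groups can mean, the following minimal sketch derives a row key from a metric name and the start of an hourly bucket, and uses the offset inside the bucket as the column qualifier. This resembles the general idea behind OpenTSDB's layout, but it is not the book's code: the class name, the one-hour bucket size, and the string key format are all assumptions made for this example.

```java
// Hypothetical helper, not taken from the book: groups measurements
// into fixed-size time buckets so that one row holds one bucket.
final class TimeBucketKey {
    // Assumption: one row per metric per hour.
    static final long BUCKET_SECONDS = 3600L;

    // Row key = metric name + start of the bucket the timestamp falls into.
    // A real table would more likely use fixed-width bytes than a string.
    static String rowKey(String metric, long epochSeconds) {
        long bucketStart = epochSeconds - Math.floorMod(epochSeconds, BUCKET_SECONDS);
        return metric + ":" + bucketStart;
    }

    // Column qualifier = offset of the measurement inside its bucket, in seconds.
    static int qualifier(long epochSeconds) {
        return (int) Math.floorMod(epochSeconds, BUCKET_SECONDS);
    }

    public static void main(String[] args) {
        long ts = 3661L; // 1 hour, 1 minute, 1 second after the epoch
        System.out.println(rowKey("cpu.load", ts) + " / " + qualifier(ts));
    }
}
```

With this layout, all measurements of one metric within the same hour land in a single row, so reading an hour of data is a single-row read rather than a scan over thousands of rows.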

Chapter 6, Denormalization Use Cases, discusses one of the most common design patterns for NoSQL, denormalization, where the data is duplicated in more than one table, resulting in huge performance benefits. It also shows when to unlearn one's SQL normalization rules and how to apply denormalization wisely.

Chapter 7, Advanced Patterns for Data Modeling, shows you how to implement many-to-many relationships in HBase and how to deal with transactions using compound keys.
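To make the idea of a compound key concrete, here is a minimal sketch of one common way to model a many-to-many relationship: each relationship becomes a row whose key joins the two entity identifiers, and a second key with the parts swapped supports lookups from the other side. The class name, the `|` separator, and the sample identifiers are hypothetical, and this is not necessarily the schema the chapter uses.

```java
// Hypothetical helper, not taken from the book: builds compound row keys
// for a many-to-many relationship such as users and groups.
final class CompoundKey {
    // Assumption: identifiers never contain this separator.
    static final char SEP = '|';

    // Key for the forward direction, for example user -> group.
    static String join(String left, String right) {
        return left + SEP + right;
    }

    // Key for the opposite direction, for example group -> user.
    static String reverse(String key) {
        int i = key.indexOf(SEP);
        return key.substring(i + 1) + SEP + key.substring(0, i);
    }

    // Prefix that matches every relationship of one entity in a prefix scan.
    static String scanPrefix(String left) {
        return left + SEP;
    }

    public static void main(String[] args) {
        String key = join("user42", "group7");
        System.out.println(key + " <-> " + reverse(key));
    }
}
```

Because HBase keeps rows sorted by key, all rows sharing the prefix returned by `scanPrefix` are stored next to each other, so listing every group of one user becomes a short range scan.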

Chapter 8, Performance Optimization, covers bulk loading for the initial data load into HBase, profiling HBase applications, benchmarking, and load testing.

What you need for this book

All the software used in this book is open source and free. You need Linux and Internet access. The book teaches you how to download and install the rest.

Who this book is for

If you implement practical big data solutions that involve quick access to massive amounts of data, this is the book for you. Primarily intended for software developers and architects, it can also be used by project managers, investors, and entrepreneurs who plan software implementations.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The key is what you save when EC2 created the key pair for you, and <cm-url> is the URL of the server where you run the Cloudera Manager."

A block of code is set as follows:

private void generate(int nUsers, int nEmails) throws IOException {
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        new File(Util.ROOT_DIR).mkdirs();
        Charset charset = Charset.forName("US-ASCII");
        if (nEmails < 1) {
            nEmails = 1;
        }

Any command-line input or output is written as follows:

./sqlline.sh localhost $HBASE_BOOK_HOME/generated/users.txt

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Also, you can see how Start Key and End Key, we specified, are showing up."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail us and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us, and we will do our best to address the problem.