Seven NoSQL Databases in a Week

By: Sudarshan Kadambi, Xun (Brian) Wu
Overview of this book

This is the golden age of open source NoSQL databases. With enterprises having to work with large amounts of unstructured data and moving away from expensive monolithic architecture, the adoption of NoSQL databases is rapidly increasing. Being familiar with the popular NoSQL databases and knowing how to use them is a must for budding DBAs and developers. This book introduces you to the different types of NoSQL databases and gets you started with seven of the most popular NoSQL databases used by enterprises today. We start off with a brief overview of what NoSQL databases are, followed by an explanation of why and when to use them. The book then covers the seven most popular databases in each of these categories: MongoDB, Amazon DynamoDB, Redis, HBase, Cassandra, InfluxDB, and Neo4j. The book doesn't go into too much detail about each database but teaches you enough to get started with them. By the end of this book, you will have a thorough understanding of the different NoSQL databases and their functionalities, empowering you to select and use the right database according to your needs.
Table of Contents (16 chapters)
Title Page
Copyright and Credits
Dedication
Packt Upsell
Contributors
Preface
Index

System trade-offs


In the introduction to this book, we discussed some of the design principles of distributed systems and the inherent trade-offs we must choose between when setting out to build one.

How does HBase make those trade-offs? Which aspects of its architecture are affected by these design choices, and what effect do they have on the set of use cases that it might be a fit for?

At this point, we already know that HBase range-partitions the key space, dividing it into key ranges assigned to different regions. The META table records these range assignments. This differs from Cassandra, which uses consistent hashing and has no central state store that captures the data placement state.
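The contrast between the two placement schemes can be sketched in a few lines of Python. This is an illustrative toy, not HBase or Cassandra code: the `META` list, the server and node names, and the `locate_by_range` and `locate_by_hash` helpers are all hypothetical, and the MD5-based ring is an assumption made only to keep the example deterministic.

```python
import bisect
import hashlib

# Hypothetical META-like table: sorted region start keys and their servers.
# Region i covers [start_keys[i], start_keys[i + 1]); the first start key is
# the empty string, so every row key falls into some region.
META = [
    ("", "server-a"),
    ("g", "server-b"),
    ("p", "server-c"),
]
START_KEYS = [start for start, _ in META]


def locate_by_range(row_key):
    """Range partitioning: consult the central table of range assignments."""
    idx = bisect.bisect_right(START_KEYS, row_key) - 1
    return META[idx][1]


# Hypothetical hash ring: each node owns the arc ending at its token.
NODES = ["node-1", "node-2", "node-3"]


def _token(value):
    """Map a string to a position on the ring (assumed MD5 for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


RING = sorted((_token(node), node) for node in NODES)
RING_TOKENS = [token for token, _ in RING]


def locate_by_hash(row_key):
    """Consistent hashing: placement is computed, no central table is read."""
    idx = bisect.bisect_left(RING_TOKENS, _token(row_key)) % len(RING)
    return RING[idx][1]
```

In this sketch, adjacent keys such as `"ga"` and `"gb"` land on the same server under range partitioning, which is what makes ordered scans cheap, whereas the hash ring spreads keys without regard to their sort order and needs no lookup table.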

We already know that HBase is an LSM (log-structured merge) database, converting random writes into a stream of append operations. This allows it to achieve higher write throughput than conventional databases, and also makes the layering on top of HDFS possible...
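To make the idea of turning random writes into appends concrete, here is a minimal sketch of an LSM-style store. It is a toy under stated assumptions, not HBase's implementation: the `TinyLSM` class, its `flush_threshold` parameter, and the in-memory lists standing in for log and segment files are all hypothetical.

```python
import bisect


class TinyLSM:
    """Toy LSM store: writes are appended, never updated in place."""

    def __init__(self, flush_threshold=3):
        self.log = []        # append-only log (stands in for a file)
        self.memstore = {}   # in-memory buffer of the most recent writes
        self.segments = []   # immutable sorted segments, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # A write to any key is a sequential append plus a memory update.
        self.log.append((key, value))
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out once, in sorted order, as a new segment.
        if self.memstore:
            self.segments.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, key):
        # The newest data wins: check memory first, then newer segments.
        if key in self.memstore:
            return self.memstore[key]
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None
```

Because a segment is never modified after it is written, a store shaped like this only needs a file system that supports creating and appending to files, which is the property that makes it a plausible fit for HDFS. The cost shows up on the read side, where several segments may have to be consulted for one key.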