Cassandra Design Patterns - Second Edition

By: Rajanarayanan Thottuvaikkatumana

Overview of this book

If you are new to Cassandra but well-versed in RDBMS modeling and design, it is natural to model data in Cassandra the same way, which results in poorly performing applications and defeats the real purpose of Cassandra. If you want to learn to make the most of Cassandra, this book is for you. It starts with strategies to integrate Cassandra with other legacy data stores and progresses to the ways in which a migration from an RDBMS to Cassandra can be accomplished. The journey continues with ideas for migrating data from cache solutions to Cassandra. With the stage set, the book moves on to some of the most commonly seen problems in applications when dealing with consistency, availability, and partition tolerance guarantees. Cassandra is exceptionally good at dealing with temporal data, so patterns such as the time-series pattern and the log pattern are covered next. Many NoSQL data stores fail miserably when a huge amount of data is read for analytical purposes, but Cassandra is different in this regard. Keeping analytical needs in mind, you’ll walk through different and interesting design patterns. No theoretical discussion is complete without a good set of use cases to which the knowledge gained can be applied, so the book concludes with a set of use cases to which you can apply the patterns you’ve learned.
Table of Contents (15 chapters)

Processing big data


To perform data analysis, the data must first be processed and transformed to suit the analytical need. In other words, analysis is the goal, and data processing and data transformation are the means of achieving it. The focus here, and the tools and technologies discussed, will therefore revolve around data processing and data transformation. Even though analysis is the end goal, it is achieved by getting these two aspects right.
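The relationship between the three stages can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the record layout, the sample values, and the function names `process`, `transform`, and `analyze` are hypothetical and do not come from the book. It only shows that the analysis step becomes trivial once the data has been processed and transformed.

```python
from collections import defaultdict

# Hypothetical raw input: one "sensor_id,timestamp,reading" string per event.
RAW_EVENTS = [
    "s1,2016-01-01T00:00:00,20.5",
    "s2,2016-01-01T00:00:00,18.0",
    "s1,2016-01-01T01:00:00,21.5",
    "bad-record",
    "s2,2016-01-01T01:00:00,19.0",
]


def process(lines):
    """Processing: parse raw lines and skip records without three fields."""
    for line in lines:
        parts = line.split(",")
        if len(parts) != 3:
            continue
        sensor_id, timestamp, reading = parts
        yield sensor_id, timestamp, float(reading)


def transform(records):
    """Transformation: reshape the records into readings grouped by sensor."""
    grouped = defaultdict(list)
    for sensor_id, _timestamp, reading in records:
        grouped[sensor_id].append(reading)
    return dict(grouped)


def analyze(grouped):
    """Analysis: compute the average reading per sensor."""
    return {sensor_id: sum(values) / len(values)
            for sensor_id, values in grouped.items()}


averages = analyze(transform(process(RAW_EVENTS)))
print(averages)
```

At large scale, each stage would run on a cluster rather than in a single process, but the division of work stays the same.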

Over the last decade, many technologies that process large-scale data have arrived in the market. They have many things in common: they are open source, they run on commodity hardware, they support clustering inherently, and they are backed by reputable companies that make the technology production-ready at scale.

Apache Bigtop is an Apache Foundation project that helps the infrastructure engineers with...