Book Image

Mastering Apache Cassandra 3.x - Third Edition

By : Aaron Ploetz, Tejaswi Malepati, Nishant Neeraj
Book Image

Mastering Apache Cassandra 3.x - Third Edition

By: Aaron Ploetz, Tejaswi Malepati, Nishant Neeraj

Overview of this book

With ever-increasing rates of data creation, the demand for storing data fast and reliably becomes a need. Apache Cassandra is the perfect choice for building fault-tolerant and scalable databases. Mastering Apache Cassandra 3.x teaches you how to build and architect your clusters, configure and work with your nodes, and program in a high-throughput environment, helping you understand the power of Cassandra as per the new features. Once you’ve covered a brief recap of the basics, you’ll move on to deploying and monitoring a production setup and optimizing and integrating it with other software. You’ll work with the advanced features of CQL and the new storage engine in order to understand how they function on the server-side. You’ll explore the integration and interaction of Cassandra components, followed by discovering features such as token allocation algorithm, CQL3, vnodes, lightweight transactions, and data modelling in detail. Last but not least you will get to grips with Apache Spark. By the end of this book, you’ll be able to analyse big data, and build and manage high-performance databases for your application.
Table of Contents (12 chapters)

Cassandra's read path

The Cassandra read path is somewhat more complex. Similar to the write path, structures in-memory and on-disk structures are examined, and then reconciled:

Figure 2.4: An illustration of the Cassandra read path, illustrating how the different in-memory and on-disk structures work together to satisfy query operations

As shown in the preceding figure, a node handling a read operation will send that request on two different paths. One path checks the memtables (in RAM) for the requested data.

If row-caching is enabled (it is disabled by default), it is checked for the requested row, followed by the bloom filter. The bloom filter (Ploetz, et-al 2018) is a probability-based structure in RAM, which speeds up reads from disk by determining which SSTables are likely to contain the requested data.

If the response from the bloom filter is negative, the partition...