Apache Cassandra Essentials

By: Nitin Padalia

Overview of this book

Apache Cassandra Essentials takes you step by step from the basics of installation to advanced installation options and database design techniques. It gives you all the information you need to design a well-distributed, high-performance database effectively. You'll learn about the steps that a Cassandra node performs when you execute a read/write query, which is essential for properly maintaining a Cassandra cluster and debugging any issues. Next, you'll discover how to integrate a Cassandra driver into your applications and perform read/write operations. Finally, you'll learn about the various tools provided by Cassandra for serviceability aspects such as logging, metrics, backup, and recovery.

Data distribution


One of the key features of Cassandra is auto-sharding: data is automatically distributed among the nodes in a cluster based on partition keys. A partition key consists of one or more columns that form part of the primary key of a column family. Data is distributed according to the token value calculated from the partition key, and a partitioner determines how these distribution tokens are calculated. Each node of a Cassandra cluster owns a range of tokens. A row is stored on the node that owns the token corresponding to the row's partition key.
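
The mapping from partition key to owning node can be illustrated with a minimal Python sketch. It is not Cassandra's implementation: the `token_for` function uses MD5 from the standard library as a stand-in hash (Murmur3 is not in the standard library), and the `TokenRing` class, the node names, and the token values are all hypothetical. It assumes one token per node and that a node owns the range ending at its own token, wrapping around at the end of the ring.

```python
import bisect
import hashlib
from typing import Dict

# Assumption for this sketch: tokens are 128-bit integers derived from MD5.
MAX_TOKEN = 2 ** 128


def token_for(partition_key: str) -> int:
    """Stand-in tokenizer: the MD5 digest of the key as an integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class TokenRing:
    """Hypothetical ring in which each node owns the range ending at its token."""

    def __init__(self, node_tokens: Dict[str, int]):
        # Keep (token, node) pairs sorted by token so lookups can use bisect.
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token: int) -> str:
        # Find the first node whose token is >= the row's token.
        idx = bisect.bisect_left(self._tokens, token)
        if idx == len(self._tokens):
            idx = 0  # Past the highest token: wrap around to the first node.
        return self._entries[idx][1]

    def owner_of_key(self, partition_key: str) -> str:
        return self.owner_of_token(token_for(partition_key))


# Four hypothetical nodes with evenly spaced tokens.
ring = TokenRing({
    "node-a": MAX_TOKEN // 4,
    "node-b": MAX_TOKEN // 2,
    "node-c": 3 * MAX_TOKEN // 4,
    "node-d": MAX_TOKEN - 1,
})

for key in ("user-1", "user-2", "user-3"):
    print(key, "->", ring.owner_of_key(key))
```

Because the token depends only on the partition key, every row with the same partition key lands on the same node, while different keys spread across the ring according to how evenly the hash function distributes its output.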

The partitioner is set using the partitioner configuration option in cassandra.yaml. New clusters should use Murmur3Partitioner, as it is faster than the older partitioners and also distributes data more efficiently. The other partitioners, retained for backward compatibility, are RandomPartitioner, ByteOrderedPartitioner, and OrderPreservingPartitioner.

Here is a brief description of all the listed partitioners:

  • Murmur3Partitioner: This is the default...