Mastering Apache Cassandra 3.x - Third Edition

By: Aaron Ploetz, Tejaswi Malepati, Nishant Neeraj

Overview of this book

With ever-increasing rates of data creation, storing data quickly and reliably has become a necessity. Apache Cassandra is well suited to building fault-tolerant and scalable databases. Mastering Apache Cassandra 3.x teaches you how to build and architect your clusters, configure and work with your nodes, and program in a high-throughput environment, helping you understand the power of Cassandra and its newer features. After a brief recap of the basics, you’ll move on to deploying and monitoring a production setup, then optimizing it and integrating it with other software. You’ll work with the advanced features of CQL and the new storage engine in order to understand how they function on the server side. You’ll explore the integration and interaction of Cassandra components, then look in detail at features such as the token allocation algorithm, CQL3, vnodes, lightweight transactions, and data modelling. Last but not least, you will get to grips with Apache Spark. By the end of this book, you’ll be able to analyse big data, and build and manage high-performance databases for your application.
Table of Contents (12 chapters)

Configuration

At this point, you could start your node with no further configuration. However, it is good to get into the habit of checking and adjusting the properties described in the following sections.

cassandra.yaml

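It is usually a good idea to rename your cluster before the node's first start, since the name is stored in the system tables and a later mismatch prevents startup. Inside the conf/cassandra.yaml file, set the cluster_name property, overwriting the default Test Cluster: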

cluster_name: 'PermanentWaves'

The default num_tokens value of 256 has proven to be too high for the newer 3.x versions of Cassandra. Set it to 24:

num_tokens: 24

To enable user security, change the authenticator and authorizer properties (from their defaults) to the following values:

authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer

Cassandra installs with all security disabled by default. Even if you are not concerned with security on your local system, it makes sense to enable it to get used to working with authentication and authorization from a development perspective.

By default, Cassandra will come up bound to localhost (127.0.0.1). For your own local development machine, this is probably fine. However, if you want to build a multi-node cluster, you will want to bind to your machine's IP address. For this example, I will use 192.168.0.101. To configure the node to bind to this IP, adjust the listen_address property (used for node-to-node communication) and the rpc_address property (used for client connections):

listen_address: 192.168.0.101
rpc_address: 192.168.0.101

If you set listen_address and rpc_address, you'll also need to adjust your seed list, which defaults to 127.0.0.1:

seeds: 192.168.0.101
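
In the cassandra.yaml that ships with 3.x, seeds is not a top-level key; it sits inside the seed_provider block. As a rough sketch, assuming the default SimpleSeedProvider and the single example address used here, the edited block would look something like this:

```yaml
# Seed list as it is nested in the stock cassandra.yaml (sketch only).
seed_provider:
    # SimpleSeedProvider is the default provider class.
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # One quoted string; additional seeds would be comma-delimited
          # inside the same quotes. Only the example IP is used here.
          - seeds: "192.168.0.101"
```

In a multi-node cluster, each node's seed list would normally name the same seed nodes, so that new nodes all contact the same entry points when they join.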

I will also adjust my endpoint_snitch property to use GossipingPropertyFileSnitch:

endpoint_snitch: GossipingPropertyFileSnitch

cassandra-rackdc.properties

Multi-data center awareness is one of Apache Cassandra's strongest features among NoSQL databases. To configure it, each node must use GossipingPropertyFileSnitch (set in cassandra.yaml, as shown previously) and must have its local data center and rack defined. Therefore, I will set the dc and rack properties in the conf/cassandra-rackdc.properties file:

dc=ClockworkAngels
rack=R40