Mastering Apache Cassandra

By: Nishant Neeraj

Overview of this book

Apache Cassandra is the perfect choice for building fault tolerant and scalable databases. Implementing Cassandra will enable you to take advantage of its features which include replication of data across multiple datacenters with lower latency rates. This book details these features that will guide you towards mastering the art of building high performing databases without compromising on performance.

Mastering Apache Cassandra aims to give enough knowledge to enable you to program pragmatically and help you understand the limitations of Cassandra. You will also learn how to deploy a production setup and monitor it, understand what happens under the hood, and how to optimize and integrate it with other software.

Mastering Apache Cassandra begins with a discussion on understanding Cassandra’s philosophy and design decisions while helping you understand how you can implement it to resolve business issues and run complex applications simultaneously.

You will also get to know about how various components of Cassandra work with each other to give a robust distributed system. The different mechanisms that it provides to solve old problems in new ways are not as twisted as they seem; Cassandra is all about simplicity. Learn how to set up a cluster that can face a tornado of data reads and writes without wincing.

If you are a beginner, you can use the examples to help you play around with Cassandra and test the water. If you are at an intermediate level, you may prefer to use this guide to help you dive into the architecture. To a DevOp, this book will help you manage and optimize your infrastructure. To a CTO, this book will help you unleash the power of Cassandra and discover the resources that it requires.

Installing Cassandra locally


Installing Cassandra on your local machine for experimental or development purposes is as easy as downloading and extracting the tarball (the compressed .tar.gz archive). For development, Cassandra has no demanding hardware requirements: any modern computer with 1 GB of RAM and a dual-core processor is enough to test the water, and anything more is a bonus. All the examples in this chapter were run on a laptop with 4 GB of RAM, a dual-core processor, and the Ubuntu 13.04 operating system. Cassandra is supported on all major platforms because it runs on the Java Virtual Machine. Here are the steps to install Cassandra locally:

  1. Install Oracle Java 1.6 (Java 6) or higher. The Java Runtime Environment (JRE) is sufficient to run Cassandra, but you will need the Java Development Kit (JDK) if you plan to write code in Java.

    Note

    Hadoop examples in the later part of this book use Java code.

    # Check if you have Java
    ~$ java -version
    java version "1.7.0_21"
    Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

    If you do not have Java, follow the installation instructions for your platform on the Oracle Java website (http://www.oracle.com/technetwork/java/javase/downloads/index.html).

  2. Download a Cassandra 1.1.x release from the Apache archive, http://archive.apache.org/dist/cassandra/. This book uses Cassandra 1.1.11, which was the latest version at the time of writing.

    Note

    By the time you read this book, version 1.2.x or Cassandra 2.0 may be available, and both differ from 1.1.x in some respects. To follow the examples, it is best to stick with the 1.1.x version. A later chapter covers how to work with later versions and the new features they offer.

    Uncompress this file to a suitable directory.

    # Download Cassandra
    $ wget http://archive.apache.org/dist/cassandra/1.1.11/apache-cassandra-1.1.11-bin.tar.gz
    # Untar to /home/nishant/apps/ (the target directory must already exist)
    $ tar xvzf apache-cassandra-1.1.11-bin.tar.gz -C /home/nishant/apps/

    The extracted directory is /home/nishant/apps/apache-cassandra-1.1.11. Let's call this location CASSANDRA_HOME. Wherever this book refers to CASSANDRA_HOME, assume it is the directory where Cassandra is installed.

  3. Configure where Cassandra stores its data. Edit $CASSANDRA_HOME/conf/cassandra.yaml and make the changes in steps 4 to 7.

  4. Set a cluster name:

    cluster_name: 'nishant_sandbox'

  5. Set the data directory:

    data_file_directories:
        - /home/nishant/apps/data/cassandra/data

  6. Set the commit log directory:

    commitlog_directory: /home/nishant/apps/data/cassandra/commitlog

  7. Set the saved caches directory:

    saved_caches_directory: /home/nishant/apps/data/cassandra/saved_caches

  8. Set the logging location. Edit $CASSANDRA_HOME/conf/log4j-server.properties:

    log4j.appender.R.File=/tmp/cassandra.log
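
The steps above can be checked with a small preflight script before the first start. The following is a minimal sketch, not part of Cassandra: the function names java_version_ok, is_cassandra_home, and make_cassandra_dirs are hypothetical helpers introduced here for illustration, and the paths in the usage comments are the ones used in this chapter. Pre-creating the storage directories is a precaution rather than a documented requirement; it surfaces permission problems before the server starts.

```shell
#!/bin/sh
# Preflight sketch for the installation steps. All function names below are
# hypothetical helpers, not part of Cassandra.

# Succeeds if a version string as printed by "java -version"
# (for example 1.7.0_21) is Java 6 (1.6) or newer.
java_version_ok() {
    version="$1"
    major="${version%%.*}"        # text before the first dot
    rest="${version#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    case "$major" in
        ''|*[!0-9]*) return 1 ;;  # empty or non-numeric: reject
    esac
    if [ "$major" -gt 1 ]; then
        return 0                  # newer numbering such as 11.0.2
    fi
    [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 6 ]
}

# Succeeds if a directory looks like an unpacked Cassandra distribution:
# it must contain the startup script and the main configuration file.
is_cassandra_home() {
    [ -f "$1/bin/cassandra" ] && [ -f "$1/conf/cassandra.yaml" ]
}

# Creates the data, commit log, and saved caches directories under a base
# path and confirms that each one is writable by the current user.
make_cassandra_dirs() {
    base="$1"
    for sub in data commitlog saved_caches; do
        mkdir -p "$base/$sub" || return 1
        [ -w "$base/$sub" ] || return 1
    done
}

# Usage with the paths from this chapter (adjust them to your machine).
# Note that "java -version" writes to stderr, hence the 2>&1 redirect.
# ver=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
# java_version_ok "$ver" || echo "Java 6 or newer is required" >&2
# is_cassandra_home /home/nishant/apps/apache-cassandra-1.1.11 || echo "unexpected layout" >&2
# make_cassandra_dirs /home/nishant/apps/data/cassandra
```

The directory names created by make_cassandra_dirs must match the values set in cassandra.yaml; if you chose different paths in steps 5 to 7, change the helper accordingly.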

With the configuration in place, you are ready to start Cassandra. Open a shell and run $CASSANDRA_HOME/bin/cassandra -f. The -f option keeps Cassandra in the foreground, so you can watch the logs in the terminal and press Ctrl + C to shut the server down. To run it in the background, omit the -f option. The server is ready when you see Bootstrap/Replace/Move completed! Now serving reads. in the startup log, as shown:

$ /home/nishant/apps/apache-cassandra-1.1.11/bin/cassandra -f
xss =  -ea -javaagent:/home/nishant/apps/apache-cassandra-1.1.11/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 INFO 20:16:02,297 Logging initialized
[-- snip --]
 INFO 20:16:08,386 Node localhost/127.0.0.1 state jump to normal
 INFO 20:16:08,394 Bootstrap/Replace/Move completed! Now serving reads.
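
When Cassandra runs in the background, the readiness message goes to the log file instead of the terminal. The following is a minimal sketch of how a script might wait for it, assuming the log location /tmp/cassandra.log configured earlier in log4j-server.properties. The function name wait_for_cassandra and its timeout parameter are hypothetical and introduced here for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: polls a log file until the readiness line quoted in
# the text appears, or gives up after a number of one-second attempts.
# Arguments: path to the log file, optional attempt count (default 60).
wait_for_cassandra() {
    logfile="$1"
    attempts="${2:-60}"
    while [ "$attempts" -gt 0 ]; do
        # A missing log file is treated the same as "not ready yet".
        if grep -q "Now serving reads" "$logfile" 2>/dev/null; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Usage, assuming the paths used in this chapter:
# "$CASSANDRA_HOME/bin/cassandra"          # no -f, so it runs in the background
# wait_for_cassandra /tmp/cassandra.log 120 && echo "Cassandra is up"
# "$CASSANDRA_HOME/bin/nodetool" -h localhost ring   # optional second check
```

A log file left over from an earlier run may already contain the readiness line, so move or truncate the old log before starting the server if you rely on this check.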