Learning Cassandra for Administrators

By: Vijay Parthasarathy

Overview of this book

Apache Cassandra is a massively scalable open source NoSQL database. It is well suited to managing large amounts of structured, semi-structured, and unstructured data across multiple data centers and the cloud, and it delivers linear scalability and performance across many commodity servers with no single point of failure. This book starts by explaining the basic concepts, the CAP theorem, and the trade-offs that shaped Cassandra’s design. You will learn how to install and configure a Cassandra cluster and how to tune it for performance. After reading the book, you should understand why the system works the way it does, and you should be able to recognize patterns (use cases) and anti-patterns that could degrade performance. The book also explains how to configure Hadoop, vnodes, and multi-DC clusters, how to enable tracing and various security features, and how to query data from Cassandra. By covering the components of Cassandra’s architecture, it helps administrators understand the system better and operate the cluster more productively. The emphasis is on use cases, problems, anti-patterns, and practical solutions rather than raw techniques. You will also learn about kernel and JVM tuning parameters that can be adjusted to get the most out of system resources.

Credits

About the Author

About the Reviewers

www.PacktPub.com

Preface

Basic Concepts and Architecture

CAP theorem

BigTable / Log-structured data model

Partitioning and replication Dynamo style

Summary

Installing Cassandra

Memory, CPU, and network requirements

Cassandra in-memory data structures

Downloading/choosing binaries to install

Cassandra on EC2 instance

Create a keyspace

Summary

Inserting Data and Manipulating Data

Querying data

Tracing

Data modeling

Summary

Administration and Large Deployments

Manual repair

Bootstrapping

Monitoring tools

Summary

Performance Tuning

vmstat

iostat

dstat

Summary

Analytics

Hadoop integration

Summary

Security and Troubleshooting

Encryption

Audit

Things to look out for

Summary

Index

iostat

The iostat command reports CPU statistics and input/output statistics for devices. It is used to monitor device load by observing the time the devices are active in relation to their average transfer rates. The first report it generates covers the whole period since the system was booted, so it says little about current load and can usually be ignored. It is therefore recommended to run iostat with an interval and a count (for example, iostat -mx 6 9, which prints nine reports at six-second intervals, with extended statistics in megabytes per second). On multiprocessor systems, CPU statistics are calculated system-wide as averages across all processors.

The columns of the device report are described in the following table:

Column	Description
`tps`	Number of transfers (I/O requests) per second for the device.
`Blk_read/s`	Blocks read per second.
`Blk_wrtn/s`	Blocks written per second.
`Blk_read`	Total blocks read.
`Blk_wrtn`	Total blocks written.
`r/s`, `w/s`	The number of read and write requests that were issued to the device per second.
`await`	The...
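
Since the first report only summarizes activity since boot, a monitoring script would normally discard it and keep the interval reports. The following is a minimal sketch of that idea in Python, not part of the book's text. It assumes a plain-text device report whose header line starts with Device and uses the column names from the table above. The SAMPLE text, its numbers, and the function names are invented for illustration, and real iostat output varies between sysstat versions.

```python
# Hypothetical captured output: two device reports separated by a blank line.
# The first block stands in for the since-boot report; all numbers are made up.
SAMPLE = """\
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.50        10.00        20.00     100000     200000
sdb               0.25         2.00         4.00      20000      40000

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               5.00         0.00       120.00          0        720
sdb               0.00         0.00         0.00          0          0
"""


def parse_device_reports(text):
    """Return one dict per report, mapping device name -> {column: value}."""
    reports = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            header = None  # a blank line ends the current report
            continue
        if fields[0].rstrip(":") == "Device":
            header = fields[1:]  # column names for the rows that follow
            reports.append({})
            continue
        if header is not None:
            name, values = fields[0], fields[1:]
            reports[-1][name] = dict(zip(header, map(float, values)))
    return reports


def skip_since_boot(text):
    """Drop the first report, which covers the time since boot."""
    return parse_device_reports(text)[1:]


interval_reports = skip_since_boot(SAMPLE)
print(interval_reports[0]["sda"]["tps"])  # 5.0
```

In practice the text would come from a captured run such as iostat with an interval and a count, and the parser would need adjusting to whatever columns that version prints.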
