Mastering Apache Cassandra - Second Edition


Backup and restoration


Cassandra provides a simple backup tool, nodetool snapshot, to take snapshots and back up data. The snapshot command first flushes MemTables to disk and then creates the backup by hard-linking the SSTables. This is safe because SSTables are immutable: once written, the files a snapshot points to never change.
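As a rough illustration of how the snapshot command might be assembled in a backup script, the following sketch builds the argument list without running anything. The function name `snapshot_command` and its parameters are hypothetical; the `-t` (snapshot name) and `-cf` (column family) flags are assumed to be available in your version of nodetool, so check `nodetool help snapshot` before relying on them.

```python
def snapshot_command(keyspace=None, column_family=None, tag=None):
    """Build an argv list for `nodetool snapshot`; nothing is executed here."""
    if column_family and not keyspace:
        # A column family is only meaningful within a keyspace.
        raise ValueError("a column family requires a keyspace")
    cmd = ["nodetool", "snapshot"]
    if tag:
        cmd += ["-t", tag]              # assumed flag: name of the snapshot
    if column_family:
        cmd += ["-cf", column_family]   # assumed flag: restrict to one column family
    if keyspace:
        cmd.append(keyspace)            # no keyspace means all keyspaces
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, check=True)` on a node where nodetool is on the PATH. Building the list separately keeps the logic easy to inspect and test.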

Note

A hard link is a directory entry associated with file data on a filesystem. It can roughly be thought of as an alias to a file that refers to the location where the data is stored. This is unlike a soft link, which aliases only the filename, not the actual underlying data.
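The behavior of a hard link can be seen in a small experiment. The sketch below is not Cassandra code; the file names are hypothetical stand-ins for an SSTable and its snapshot entry, and it assumes a filesystem that supports hard links. It shows that both names refer to the same data and that the data stays readable through the link after the original name is removed.

```python
import os
import tempfile

def demo_hard_link():
    """Hard-link a file, delete the original name, and read the data via the link."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "data-1.db")        # hypothetical SSTable stand-in
        link = os.path.join(tmp, "snapshot-data-1.db")   # hypothetical snapshot entry
        with open(original, "w") as f:
            f.write("immutable sstable bytes")
        os.link(original, link)  # second directory entry for the same file data
        same_inode = os.stat(original).st_ino == os.stat(link).st_ino
        link_count = os.stat(link).st_nlink
        os.remove(original)      # drops one name; the data is still referenced by the link
        with open(link) as f:
            content = f.read()
        return same_inode, link_count, content
```

Because the link shares the underlying data rather than copying it, creating it takes almost no time or extra disk space, which is what makes hard-link-based snapshots cheap.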

These hard links are kept within Cassandra's data directory, under <keyspace>/<column_family>/snapshots.
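To locate snapshots on a node, a script can walk the data directory and look for that layout. The following is a minimal sketch; the function name `find_snapshot_dirs` is hypothetical, and the data directory path is whatever your node is configured to use.

```python
import os

def find_snapshot_dirs(data_dir):
    """Return (keyspace, column_family, path) for each <keyspace>/<column_family>/snapshots directory."""
    found = []
    for keyspace in sorted(os.listdir(data_dir)):
        ks_path = os.path.join(data_dir, keyspace)
        if not os.path.isdir(ks_path):
            continue  # skip stray files at the keyspace level
        for column_family in sorted(os.listdir(ks_path)):
            snap = os.path.join(ks_path, column_family, "snapshots")
            if os.path.isdir(snap):
                found.append((keyspace, column_family, snap))
    return found
```

Each returned path can then be handed to whatever tool copies the snapshot files off the node.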

The general plan to back up a cluster roughly follows these steps:

  1. Take a snapshot of each node one by one. The snapshot command provides an option to specify whether to back up the entire keyspace or just the selected column families.

  2. Taking a snapshot is just half of the story. To be able to restore the database at a later point, you need to move these...