Cassandra High Availability

By: Robbie Strickland

Deleting data


We have established that Cassandra employs a log-structured storage engine, in which all writes are immutable appends to the log. The implication is that data cannot actually be removed at the time a DELETE statement is issued. Cassandra solves this by writing a marker, called a tombstone, with a timestamp greater than that of the previous value. This has the effect of overwriting the previous value with an empty one, which is then resolved in subsequent queries for that column in the same manner as any other update. The tombstone's own value is set to the time of deletion, which is used to determine when the tombstone can be removed.
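The behavior described above can be illustrated with a small toy model. This is only a sketch of the idea, not Cassandra's storage code: the `Cell` class, the `reconcile` function, and the integer timestamps are all hypothetical names and simplifications chosen for illustration, and timestamp ties are not handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One immutable log entry for a single column (hypothetical model)."""
    write_ts: int                     # timestamp used to pick the winning entry
    value: Optional[str]              # None when the entry is a tombstone
    deleted_at: Optional[int] = None  # time of deletion, set only on tombstones

    @property
    def is_tombstone(self) -> bool:
        return self.deleted_at is not None


def reconcile(cells: List[Cell]) -> Optional[str]:
    """Return the value a read would see: the newest entry wins."""
    if not cells:
        return None
    newest = max(cells, key=lambda c: c.write_ts)
    # A tombstone is treated like any other update, but it reads as "no value".
    return None if newest.is_tombstone else newest.value


# The log is append-only: a delete adds an entry, it never removes one.
log: List[Cell] = []
log.append(Cell(write_ts=100, value="alice@example.com"))
before = reconcile(log)

# "DELETE" appends a tombstone with a greater timestamp than the earlier write.
log.append(Cell(write_ts=200, value=None, deleted_at=200))
after = reconcile(log)
```

In this sketch the original value is still present in `log` after the delete; only the result of `reconcile` changes, which mirrors why a tombstone must eventually be cleaned up by a separate process.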

Garbage collection

Eventually, these tombstones are reconciled with earlier values during compaction, at which point the earlier values are discarded. Refer to Chapter 7, Modeling for High Availability, for more details on how compaction works. There are two possibilities for when data can be physically deleted and tombstones collected.

If...