Book Image

Learning Apache Cassandra

By : Matthew Brown
4 (1)
Book Image

Learning Apache Cassandra

4 (1)
By: Matthew Brown

Overview of this book

Table of Contents (19 chapters)
Learning Apache Cassandra
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Partial denormalization


Our initial approach to home timelines, which used the existing, fully normalized data structure that we've already built, is technically viable but will perform very poorly at scale. If I follow F users and want a page of size P for my home timeline, Cassandra will need to do the following:

  • Query F partitions for P rows each

  • Perform an ordered merge of FxP rows in order to retrieve only the most recent P

The most distressing part of this is the fact that both operations grow in complexity proportionally with the number of people I follow. Let's start by trying to fix this.

The basic goal of the home timeline is to show me the most recent status updates that matter to me. Instead of doing all the work to find out what status updates matter to me, based on who I follow, at read time, let's shift some of the work to write time.

I'll create a table that stores references to status updates that I care about. Whenever someone I follow creates a new status update, I'll add...