Scaling Apache Solr

By : Hrishikesh Vijay Karambelkar

Apache Solr and Cassandra


Apache Cassandra is one of the most widely used distributed, fault-tolerant NoSQL databases. It is designed to handle Big Data workloads across multiple nodes without a single point of failure. Performance benchmarks published at Planet Cassandra place Apache Cassandra among the fastest NoSQL databases compared with its competitors in terms of throughput, load, and so on. Cassandra allows schemaless storage of user information using a pattern called column families. For example, look at the data model for sales information, which is shown as follows:

When this model is transformed for the Cassandra store, it becomes columnar storage. The following image shows how this model would look using Apache Cassandra:

As one can see, the key here is the customer ID, and the value is a set of attributes/columns, which can vary for each row key. Further, columns can be compressed, which reduces the size of your data footprint. The column compression...
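
To make the row-key-to-columns layout described above more concrete, here is a minimal Python sketch that models a column family with plain dictionaries. It does not use any Cassandra client or API; the customer IDs, column names, and helper functions (`get_columns`, `put_column`, `column_names`) are hypothetical and are not taken from the book's figures. It only illustrates that each row key maps to its own set of columns and that rows need not share the same columns.

```python
# Illustrative sketch only: a column-family-like layout built from plain dicts.
# All keys, column names, and values are made-up examples, not the book's data.
from typing import Dict, List

# row key (customer ID) -> {column name -> value}
ColumnFamily = Dict[str, Dict[str, str]]

sales: ColumnFamily = {
    # This customer has a city column and a single order column.
    "customer-001": {"name": "Alice", "city": "Pune", "order:1001": "laptop"},
    # This customer has no city column but two orders and an email column.
    "customer-002": {
        "name": "Bob",
        "email": "bob@example.com",
        "order:1002": "phone",
        "order:1003": "tablet",
    },
}


def get_columns(store: ColumnFamily, row_key: str) -> Dict[str, str]:
    """Return all columns stored under a row key (empty if the key is absent)."""
    return store.get(row_key, {})


def put_column(store: ColumnFamily, row_key: str, column: str, value: str) -> None:
    """Add or overwrite one column; the row is created on first write."""
    store.setdefault(row_key, {})[column] = value


def column_names(store: ColumnFamily, row_key: str) -> List[str]:
    """List the column names of a row, sorted here only for readability."""
    return sorted(get_columns(store, row_key))


if __name__ == "__main__":
    # No schema change is needed to give one row a column the others lack.
    put_column(sales, "customer-001", "loyalty_tier", "gold")
    for key in sales:
        print(key, column_names(sales, key))
```

Because every row carries its own column names, adding `loyalty_tier` to one customer leaves the other rows untouched, which is the schemaless behaviour the paragraph describes.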