Scaling Apache Solr

By: Hrishikesh Vijay Karambelkar


Sharding and fault tolerance


We have already seen sharding, collections, and replicas in Chapter 6, Distributed Search Using Apache Solr. In this section, we will look at some of the important aspects of sharding and the role it plays in scalability and high availability. The strategy for creating new shards depends heavily on the hardware and on the shard size. Let's say you have two machines, A and B, of the same configuration, each hosting one shard. Shard A is loaded with 1 million indexed documents, and shard B is loaded with 100 documents. When a query is fired, it is sent to every shard, and the overall response time is determined by the response time of the slowest node (in this case, the one hosting shard A). Hence, a collection whose shards are close to equal in size can perform better in this case.
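The effect of uneven shard sizes can be illustrated with a small model. The following is a minimal sketch in plain Python, not Solr code: it assumes that the time a shard takes to answer grows linearly with its document count, which is a deliberate simplification, and the function names and cost constants (`ms_per_million_docs`, `fixed_ms`) are hypothetical values chosen only for illustration.

```python
def estimated_shard_latency(doc_count, ms_per_million_docs=50.0, fixed_ms=2.0):
    """Toy cost model (an assumption, not a Solr measurement):
    a fixed per-request cost plus a cost proportional to the shard size."""
    return fixed_ms + ms_per_million_docs * doc_count / 1_000_000


def distributed_query_latency(shard_doc_counts):
    """A distributed query waits for every shard, so its latency is
    bounded below by the slowest shard's latency."""
    if not shard_doc_counts:
        raise ValueError("at least one shard is required")
    return max(estimated_shard_latency(n) for n in shard_doc_counts)


# Same total number of documents, distributed in two different ways.
skewed = distributed_query_latency([1_000_000, 100])     # shard A vs. shard B
balanced = distributed_query_latency([500_050, 500_050])  # evenly split

print(f"skewed shards:   {skewed:.2f} ms")
print(f"balanced shards: {balanced:.2f} ms")
```

Under these assumed constants, the skewed layout is dominated by shard A, while the balanced layout roughly halves the work of the slowest shard. Real response times depend on caching, query complexity, and hardware, so the numbers should be read only as a relative comparison.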

Document routing and sharding

We have seen the leader-selection process in Chapter 6, Distributed Search Using Apache Solr. Typically, when any enterprise search is deployed, the size of documents to be indexed keeps growing over time. As...