Scaling Apache Solr

By: Hrishikesh Vijay Karambelkar
Optimizing SolrCloud


In any distributed system, a query that fans out across multiple nodes can only return once the slowest of those nodes has answered, so the waiting time depends on the slowest node rather than on the typical one. This is known as the "laggard problem". The response to your search query, which is an aggregation of the results from all the shards, is bounded by the slowest shard for that query; averaged over many queries, it is governed by the following formula:

QueryResponse = avg(max(shardResponseTime))
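The formula can be illustrated with a small sketch. The function names and the timing values below are hypothetical and chosen only for illustration; they are not part of Solr. The sketch assumes that each query waits for its slowest shard and that the overall figure is the average of those per-query maxima.

```python
from statistics import mean


def query_response_time(shard_times_ms):
    """Response time of one distributed query: it waits for the slowest shard."""
    return max(shard_times_ms)


def average_query_response(per_query_shard_times_ms):
    """avg(max(shardResponseTime)) over a batch of queries."""
    return mean(query_response_time(times) for times in per_query_shard_times_ms)


# Made-up timings in milliseconds: three queries, each sent to three shards.
# The third shard is the laggard in every query.
samples = [
    [12, 15, 90],
    [11, 14, 95],
    [13, 16, 85],
]

avg_response = average_query_response(samples)
avg_of_all_shards = mean(t for times in samples for t in times)
print(avg_response, avg_of_all_shards)
```

With these made-up numbers the average query response is 90 ms, although the average across all individual shard responses is only 39 ms. One slow shard is enough to dominate the result.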

If your search is distributed across shards, the shard node with the slowest response time determines your query response time, so a single slow shard increases latency for every query that touches it. Besides the laggard problem, distributed search also faces other limitations. For example, each document uploaded to the distributed index must have a unique key, and that unique key must be stored in the Solr repository. To do that, the Solr schema.xml should have stored=true against the key attribute. This key has to be unique across all shards. It enables Apache Solr to distribute...