We have already seen sharding, collections, and replicas. In this section, we will look at some important aspects of sharding and the role it plays in scalability and high availability. The strategy for creating new shards depends heavily on the hardware and the shard size. Let's say you have two machines, M1 and M2, of the same configuration, each hosting one shard. Shard A is loaded with 1 million indexed documents, and shard B is loaded with 100 documents. When a distributed query is fired, its response time is determined by the slowest shard (in this case, shard A), because the coordinating node must wait for all shards to respond before merging the results. Hence, keeping shard sizes nearly equal yields better query performance.
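The effect described above can be illustrated with a small sketch. The latency model here (response time proportional to a shard's document count) is a simplifying assumption for demonstration only, not Solr's actual cost model; the point is that the coordinator waits for the slowest shard:

```python
# Illustration: a distributed query's response time is bounded by the
# slowest shard, so balanced shards reduce overall latency.
# Assumption (hypothetical model): per-shard latency grows linearly
# with document count at a made-up rate of 0.5 ms per 1,000 docs.

def query_latency_ms(shard_doc_counts, ms_per_1000_docs=0.5):
    # Shards are queried in parallel; the coordinator must wait for all
    # of them, so total latency is the maximum per-shard latency.
    return max(docs / 1000 * ms_per_1000_docs for docs in shard_doc_counts)

# Unbalanced: shard A holds 1,000,000 docs, shard B holds 100 docs.
unbalanced = query_latency_ms([1_000_000, 100])

# Balanced: the same 1,000,100 docs split evenly across both shards.
balanced = query_latency_ms([500_050, 500_050])

print(unbalanced)  # 500.0   -- dominated entirely by shard A
print(balanced)    # 250.025 -- roughly half the latency
```

Under this toy model, rebalancing the same corpus across the two shards roughly halves the query latency, which is why near-equal shard sizes matter.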
Scaling Big Data with Hadoop and Solr, Second Edition