Scaling Big Data with Hadoop and Solr, Second Edition

By: Hrishikesh Vijay Karambelkar

Understanding a distributed search


The decision to move from a standalone system to a distributed search should be driven by the needs of the enterprise, because a distributed search application is not always the more efficient option in terms of performance. In this section, we will focus on understanding distributed search patterns and how Apache Solr supports distributed search. Efforts have been made in the past to enable Apache Solr to work with the Apache Hadoop platform, and we will look at them in more detail in the coming chapters.

Distributed search patterns

There are two important functions of any enterprise search: creating indexes and searching those indexes at runtime. Either or both of these functions can run in distributed mode, depending on the requirements of the enterprise. To use distributed search, the index must be split into multiple shards, and the shards should be kept across multiple nodes of a distributed system.
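The two functions described above, distributed index creation and distributed searching, can be illustrated with a small sketch in plain Python. This is not Apache Solr's implementation or API. The hash-based routing rule, the in-memory inverted index, and the names `shard_for`, `build_shards`, and `distributed_search` are all assumptions made for illustration, and each "shard" here is just a dictionary standing in for a node.

```python
import hashlib
from collections import defaultdict


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard by hashing the document id (a hypothetical routing rule)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def build_shards(docs: dict, num_shards: int) -> list:
    """Distributed indexing: build one small inverted index per shard."""
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in docs.items():
        index = shards[shard_for(doc_id, num_shards)]
        for term in text.lower().split():
            index[term].add(doc_id)  # term -> ids of documents containing it
    return shards


def distributed_search(shards: list, term: str) -> list:
    """Distributed searching: query every shard, then merge the partial hits."""
    hits = set()
    for index in shards:
        hits |= index.get(term.lower(), set())
    return sorted(hits)


# Illustrative data only; these documents are made up for the example.
docs = {
    "doc1": "Hadoop stores big data",
    "doc2": "Solr searches big data",
    "doc3": "Hadoop and Solr scale",
}
shards = build_shards(docs, num_shards=3)
print(distributed_search(shards, "hadoop"))  # prints ['doc1', 'doc3']
```

Whichever shard a document lands on, the merged result is the same as a search over a single index. What changes is where the work happens: each shard holds and searches only part of the data, and the merge step is extra work that a standalone system does not need.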

Tip

Sharding is a process of breaking one index into multiple logical units called ...