Scaling Big Data with Hadoop and Solr, Second Edition

The decision to move from a standalone system to a distributed search should be driven by the needs of the enterprise, because distributed search applications are not always efficient in terms of performance. In this section, we will focus on understanding distributed search patterns and how Apache Solr supports distributed search. Efforts have been made in the past to enable Apache Solr to work with the Apache Hadoop platform, and we will look at these in more detail in the coming chapters.

Distributed search patterns

There are two important functions in any enterprise search: the creation of indexes and run-time searching on those indexes. Either or both of these functions can run in distributed mode, depending on the requirements of the enterprise. To use distributed search, the index must be split into multiple shards, and these shards should be kept across multiple nodes of a distributed system.
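The idea of splitting an index into shards and searching across them can be illustrated with a small sketch. This is not how Apache Solr implements sharding; it is a toy in-memory model under stated assumptions. The function names (`shard_for`, `build_shards`, `distributed_search`), the CRC32-based routing, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
import zlib
from collections import defaultdict


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard for a document by hashing its ID (illustrative routing rule)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def build_shards(docs: dict, num_shards: int) -> list:
    """Distributed indexing: each shard holds an inverted index for its own documents."""
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in docs.items():
        index = shards[shard_for(doc_id, num_shards)]
        for term in text.lower().split():
            index[term].add(doc_id)
    return shards


def distributed_search(shards: list, term: str) -> list:
    """Distributed searching: query every shard, then merge the partial results."""
    hits = set()
    for index in shards:
        # .get avoids creating empty entries in the defaultdict while searching
        hits |= index.get(term.lower(), set())
    return sorted(hits)
```

In this sketch, each document lands in exactly one shard, and a query has to visit every shard before the partial results are merged. That extra fan-out and merge step is one reason a distributed search is not automatically faster than a standalone one.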

Tip

Sharding is a process of breaking one index into multiple logical units called ...

By: Hrishikesh Vijay Karambelkar
