Apache Solr is an open source, extensible enterprise search platform with an active community that improves it continually. Search has evolved over time, from basic search over web-crawled documents to more sophisticated search over structured and unstructured content with rich user interaction. As data volumes grow, the focus is shifting toward the effective use of MapReduce or similar distributed technologies to handle them. At the same time, the cost of enterprise storage needs to be kept under control.
Apache Lucene and Solr are designed to support large-scale implementations. A distributed environment based on Apache Solr is useful in the following cases:
Speeding up the search: If Apache Solr is taking a long time to create indexes from raw data or to search for a keyword across the index store, it is probably a good candidate for running in a distributed environment.
Index generation time: Incremental generation of indexes...