Scaling Apache Solr

By: Hrishikesh Vijay Karambelkar


Using the Solr 1301 patch – reduce-side indexing


The Solr 1301 patch also generates an index using the Apache Hadoop MapReduce framework. It was merged in Solr 4.7, so it is available in the code line of Apache Solr 4.7 and later versions. The patch is similar to the previous one (SOLR-1045); the difference is that Solr 1301 generates the indexes in the reduce phase of Apache Hadoop's MapReduce rather than in the map phase. Once the indexes are generated, they can be loaded into Solr and SolrCloud for further processing and application searching. The following diagram depicts the overall flow:

With Solr 1301, a map task converts input records into <key, value> pairs, which are then passed to the reducer. The reducer converts each pair into a SolrInputDocument and publishes it; the documents are then transformed into Solr indexes. The indexes are persisted directly on HDFS, which can later...
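The map and reduce responsibilities described above can be illustrated with a small plain-Python simulation. This is only a sketch of the data flow, not the Hadoop or Solr API: the tab-separated record layout, the field names (`id`, `title`, `body`), and the functions `map_record`, `shuffle`, and `reduce_to_document` are all assumptions made for illustration, and a plain dictionary stands in for SolrInputDocument. Writing the actual index and persisting it to HDFS are left out.

```python
from collections import defaultdict


def map_record(line):
    """Map phase: turn one raw input record into a (key, value) pair.

    Assumes a hypothetical tab-separated layout: id, title, body.
    """
    doc_id, title, body = line.split("\t")
    return doc_id, {"title": title, "body": body}


def shuffle(pairs):
    """Group values by key, as the MapReduce framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_to_document(key, values):
    """Reduce phase: build one document per key.

    The returned dict is a stand-in for a SolrInputDocument; in the real
    patch the reducer side is where documents become index data.
    """
    doc = {"id": key}
    for value in values:
        doc.update(value)  # later values for the same key overwrite earlier ones
    return doc


records = [
    "1\tScaling Solr\tDistributed search",
    "2\tHadoop basics\tMapReduce and HDFS",
]

pairs = [map_record(record) for record in records]
documents = [
    reduce_to_document(key, values)
    for key, values in sorted(shuffle(pairs).items())
]

for document in documents:
    print(document)
```

The point of the sketch is the division of labour: the map step does only lightweight record parsing, while document construction (and, in the real patch, index generation) happens on the reduce side.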