Scaling Apache Solr

By: Hrishikesh Vijay Karambelkar


Using the Solr 1045 patch – map-side indexing


The Apache Solr 1045 patch gives Solr users a way to build Solr indexes using the MapReduce framework of Apache Hadoop. Once an index is created, it can be pushed to Solr storage. The following diagram depicts the mapper and reducer in Hadoop:

Each Apache Hadoop mapper transforms input records into a set of key-value pairs, which are then converted into SolrInputDocument objects. The mapper task then builds an index from these SolrInputDocument objects.
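The map-side flow described above can be sketched in plain Python. This is only an illustration of the idea, not the code shipped with the patch: the field names in `FIELDS`, the comma-separated input format, and the helper names `map_record`, `to_document`, and `run_mapper` are all hypothetical, and an ordinary dictionary stands in for both SolrInputDocument and the index.

```python
import csv
import io

# Hypothetical schema for the input records (an assumption for this sketch).
FIELDS = ["id", "title", "body"]


def map_record(line):
    """Turn one raw comma-separated record into (field, value) pairs."""
    values = next(csv.reader(io.StringIO(line)))
    return list(zip(FIELDS, values))


def to_document(pairs):
    """Collect key-value pairs into a dict that stands in for SolrInputDocument."""
    doc = {}
    for key, value in pairs:
        doc[key] = value.strip()
    return doc


def run_mapper(lines):
    """Build a tiny in-memory 'index' (id -> document) for one input split."""
    index = {}
    for line in lines:
        doc = to_document(map_record(line))
        index[doc["id"]] = doc
    return index


# One input split, as a single mapper might receive it.
split = ["1,Scaling Solr,Sharding basics", "2,Hadoop,MapReduce overview"]
index = run_mapper(split)
```

Each mapper works only on its own split, so in a real job several such partial indexes would be produced in parallel.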

The reducer de-duplicates the indexes produced by the mappers and merges them if needed. Once the indexes are created, you can load them into your Solr instance and use them for searching. You can read more about this patch at https://issues.apache.org/jira/browse/SOLR-1045.
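As a rough illustration of the reducer's role, the following Python sketch merges several partial indexes and de-duplicates documents by their ID. It is a simplified model, not the patch's implementation: the function name `merge_indexes`, the use of dictionaries as indexes, and the rule that the last document seen for an ID wins are assumptions made for this example.

```python
def merge_indexes(partial_indexes):
    """Merge partial indexes (id -> document) into a single index.

    Documents sharing an id are de-duplicated; the last one seen wins,
    which is an assumption of this sketch rather than documented behavior.
    """
    merged = {}
    for partial in partial_indexes:
        for doc_id, doc in partial.items():
            merged[doc_id] = doc
    return merged


# Two hypothetical partial indexes, each built by a different mapper.
mapper_a = {"1": {"id": "1", "title": "Scaling Solr"}}
mapper_b = {
    "1": {"id": "1", "title": "Scaling Solr (duplicate)"},
    "2": {"id": "2", "title": "Hadoop"},
}

merged = merge_indexes([mapper_a, mapper_b])
```

Here the document with ID 1 appears in both partial indexes but only once in the merged result.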

Applying the patch follows the standard process of patching your label through SVN. To apply a patch to your Solr instance, you first need to build Solr from source, and the version you build must be one that the Solr 1045 patch supports. Now...