
Scaling Apache Solr

By: Hrishikesh Vijay Karambelkar

Apache Solr and HDFS


Apache Solr can use HDFS to build and store its indices on a Hadoop system. It does not use a MapReduce-based framework for indexing. The following diagram shows the interaction pattern between Solr and HDFS. You can read more about Apache Hadoop at http://hadoop.apache.org/docs/r2.4.0/.
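
As a rough illustration of what pointing Solr at HDFS can look like, the sketch below assembles a launch command from the system properties that the Solr 4.x reference guide lists for running on HDFS (`solr.directoryFactory`, `solr.lock.type`, `solr.data.dir`, and `solr.updatelog`). It is not necessarily the procedure this chapter follows. The helper name `solr_hdfs_startup_args`, the NameNode host and port, and the `/solr` path are hypothetical placeholders, and `start.jar` assumes the Solr 4.x example server.

```python
from typing import List
from urllib.parse import urlparse


def solr_hdfs_startup_args(hdfs_uri: str) -> List[str]:
    """Build a java command line that starts Solr with its index on HDFS.

    hdfs_uri is a placeholder such as "hdfs://namenode:8020/solr";
    replace it with the address of your own NameNode and target path.
    """
    # Reject anything that is not an hdfs:// location early.
    if urlparse(hdfs_uri).scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got {hdfs_uri!r}")
    return [
        "java",
        # Store index files through the HDFS directory factory.
        "-Dsolr.directoryFactory=HdfsDirectoryFactory",
        # Use the HDFS-based lock instead of a local file system lock.
        "-Dsolr.lock.type=hdfs",
        # Keep both the index data and the update log on HDFS.
        f"-Dsolr.data.dir={hdfs_uri}",
        f"-Dsolr.updatelog={hdfs_uri}",
        "-jar",
        "start.jar",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(solr_hdfs_startup_args("hdfs://namenode:8020/solr")))
```

Printing the command rather than executing it keeps the sketch safe to run on a machine where neither Solr nor Hadoop is installed yet.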

Let's understand how this can be done:

  1. To start with, the first and most important task is to get Apache Hadoop set up, either on your machine (proxy node configuration) or as a Hadoop cluster. You can download the latest Hadoop tarball or ZIP from http://hadoop.apache.org. The newer generation of Hadoop uses an advanced MapReduce framework (MRv2, which runs on YARN).

  2. Depending on your requirements, you can set up a single node (http://hadoop.apache.org/docs/r<version>/hadoop-project-dist/hadoop-common/SingleCluster.html) or a cluster (http://hadoop.apache.org/docs/r<version>/hadoop-project-dist/hadoop-common/ClusterSetup.html).

  3. Typically, you will be required to set up the Hadoop...