Scaling Apache Solr

By : Hrishikesh Vijay Karambelkar

Apache Solr and HDFS

Apache Solr can use HDFS to store its indices, reading and writing the index files directly on the Hadoop filesystem. It does not use a MapReduce-based framework for indexing. The following diagram shows the interaction pattern between Solr and HDFS. You can read more details about Apache Hadoop at http://hadoop.apache.org/docs/r2.4.0/.
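To make the idea concrete, here is a minimal sketch in Python that assembles the Java command line for starting Solr with its index directory on HDFS. The system properties (`solr.directoryFactory`, `solr.lock.type`, `solr.data.dir`, `solr.updatelog`) are the ones described in the Solr 4.x reference guide for running Solr on HDFS. The NameNode address, the base path, the `data` and `tlog` subdirectories, and the helper name `solr_hdfs_command` are assumptions made for illustration only; adjust them to your own installation.

```python
import shlex


def solr_hdfs_command(namenode, base_path, jar="start.jar"):
    """Build the java command that starts Solr with its index stored on HDFS.

    namenode  -- host:port of the HDFS NameNode (hypothetical value in the demo)
    base_path -- HDFS directory reserved for Solr (hypothetical layout)
    """
    root = f"hdfs://{namenode}{base_path.rstrip('/')}"
    props = {
        # Tell Solr to read and write index files through HDFS.
        "solr.directoryFactory": "HdfsDirectoryFactory",
        # Use the HDFS-aware lock implementation instead of a local file lock.
        "solr.lock.type": "hdfs",
        # Assumed layout: index data and update log under one base path.
        "solr.data.dir": f"{root}/data",
        "solr.updatelog": f"{root}/tlog",
    }
    return ["java"] + [f"-D{k}={v}" for k, v in props.items()] + ["-jar", jar]


if __name__ == "__main__":
    # Example values only; replace with your NameNode address and path.
    cmd = solr_hdfs_command("localhost:9000", "/solr")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Running the script prints a single command that you can paste into a shell from the Solr example directory. Building the command as a list keeps each property as a separate argument, which avoids quoting problems if you later pass it to `subprocess.run`.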

Let's understand how this can be done:

The first and most important task is to get Apache Hadoop running, either on a single machine (a single-node, pseudo-distributed configuration) or as a Hadoop cluster. You can download the latest Hadoop tarball or ZIP from http://hadoop.apache.org. The newer generation of Hadoop (2.x) runs MapReduce on YARN, which is also known as MapReduce 2.0 (MRv2).
Depending on your requirements, you can set up a single node (http://hadoop.apache.org/docs/r<version>/hadoop-project-dist/hadoop-common/SingleCluster.html) or a cluster (http://hadoop.apache.org/docs/r<version>/hadoop-project-dist/hadoop-common/ClusterSetup.html).
Typically, you will be required to set up the Hadoop...