Scaling Apache Solr

By: Hrishikesh Vijay Karambelkar

Chapter 10. Scaling Solr Capabilities with Big Data

In today's world, organizations produce gigabytes of information every day from the many applications their employees use. The data sources range from application databases, online social media, and mobile devices to system logs and sensors in factory operational subsystems. With data this large and heterogeneous, it is a challenge for IT teams to process it together and provide analytics, and the volume of this information is growing exponentially. Given such variety and veracity, standard data-processing applications struggle with large datasets, and traditional distributed systems cannot handle this Big Data. In this chapter, we look at the problem of handling Big Data using Apache Solr and other distributed systems.

We have already seen some information about NoSQL databases and the CAP theorem in Chapter 2, Getting Started with Apache...