Apache Solr Enterprise Search Server - Third Edition

By: David Smiley, Eric Pugh, Kranti Parisa, Matt Mitchell

Overview of this book

Apache Solr is a widely popular open source enterprise search server that delivers powerful search and faceted navigation, features that are elusive with databases. Solr supports complex search criteria, faceting, result highlighting, query completion, query spell-checking, relevancy tuning, geospatial searches, and much more.

This book is a comprehensive resource for just about everything Solr has to offer, and it will take you from first exposure to development and deployment in no time. Even if you wish to use Solr 5, you should find the information to be just as applicable due to Solr's high regard for backward compatibility. The book includes some useful information specific to Solr 5.
Table of Contents (19 chapters)
Apache Solr Enterprise Search Server Third Edition
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Solr and Hadoop


Apache Hadoop and the big data ecosystem have exploded in popularity, and most developers are at least loosely familiar with them. Needless to say, many pieces of the Hadoop ecosystem work together to form a big data platform. It's mostly an à la carte world in which you combine the pieces you want, each having different uses or making different trade-offs between ease of coding and performance. What does Solr have to do with Hadoop, you may ask? Read on.

HDFS

As an alternative to a standard filesystem, Solr can store its indexes in the Hadoop Distributed File System (HDFS). HDFS acts like a shared filesystem for Solr, somewhat like networked storage (for example, a SAN), but it is implemented at the application layer rather than at the OS or hardware layer. HDFS offers almost limitless growth, and you can increase storage incrementally without restarting or reconfiguring the server processes supporting it. HDFS has redundancy too, although this is extra-redundant...