Solr 1.4 Enterprise Search Server

By: David Smiley, Eric Pugh

Overview of this book

If you are a developer building a high-traffic web site, you need a terrific search engine. Sites like Netflix.com and Zappos.com employ Solr, an open source enterprise search server that uses and extends the Lucene search library. This is the first book on the market on Solr, and it shows you how to equip your web site for high-volume traffic with full-text search capabilities and a wide range of customization options, so that your users get a terrific search experience.

This book is a comprehensive reference guide for every feature Solr has to offer. It serves the reader from initiation through development to deployment. It also comes with complete running examples that demonstrate Solr's use and show how to integrate it with other languages and frameworks.

The book first gives you a quick overview of Solr, and then gradually takes you from basic to advanced features that enhance your search. It starts off by discussing Solr and helping you understand how it fits into your architecture: where databases and document/web crawlers fall short, and where Solr shines. The main part of the book is a thorough exploration of nearly every feature that Solr offers. To keep this interesting and realistic, we use a large open source set of metadata about artists, releases, and tracks, courtesy of the MusicBrainz.org project. Using this data as a testing ground for Solr, you will learn how to import it in various ways, from CSV to XML to database access. You will then learn how to search this data in a myriad of ways, including using Solr's rich query syntax, "boosting" match scores based on record data and other means, searching across multiple fields with different boosts, getting facets on the results, auto-completing user queries, spell-correcting searches, and highlighting queried text in search results.

After this thorough tour, we'll demonstrate working examples of integrating a variety of technologies with Solr, such as Java, JavaScript, Drupal, Ruby, XSLT, PHP, and Python. Finally, we'll cover various deployment considerations, including indexing strategies and performance-oriented configuration, that will enable you to scale Solr to meet the needs of a high-volume site.

Credits

About the Authors

About the Reviewers

Preface

Quick Starting Solr

An introduction to Solr

Comparison to database technology

Getting started

A quick tour of Solr!

The schema and configuration files

Solr resources outside this book

Summary

Schema and Text Analysis

MusicBrainz.org

One combined index or multiple indices

Schema design

The schema.xml file

Text analysis

Summary

Indexing Data

Communicating with Solr

Using curl to interact with Solr

Remote streaming

Sending XML to Solr

Sending CSV to Solr

Direct database and XML import

Indexing documents with Solr Cell

Summary

Basic Searching

Your first search, a walk-through

Solr's generic XML structured data representation

Solr's XML response format

Sorting

Scoring

Summary

Enhanced Searching

Function queries

Dismax Solr request handler

Faceting

Summary

Search Components

About components

The highlighting component

Query elevation

Spell checking

The more-like-this search component

Stats component

Field collapsing

Other components

Summary

Deployment

Implementation methodology

Installing into a Servlet container

Logging

A SearchHandler per search interface

Solr cores

JMX

Securing Solr

Summary

Integrating Solr

Structure of included examples

SolrJ: Simple Java interface

Using JavaScript to integrate Solr

Accessing Solr from PHP applications

Ruby on Rails integrations

Summary

Scaling Solr

Tuning complex systems

Optimizing a single Solr server (Scale High)

Moving to multiple Solr servers (Scale Wide)

Combining replication and sharding (Scale Deep)

Summary

Index

Logging

Solr's logging facility provides a wealth of information, from basic performance statistics, to what queries are being run, to any exceptions encountered by Solr. The log files should be one of the first places to look when you want to investigate any issues with your Solr deployment. There are two types of logs:

the HTTP server request logs, which record the individual web requests coming into Solr
the application logs, which are written through SLF4J; SLF4J in turn delegates to the built-in JDK logging facility (java.util.logging) to record the internal operations of Solr
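Because the application logs end up in the JDK logging facility, their verbosity can be adjusted with the standard java.util.logging API. The following is a minimal sketch, not Solr's own configuration mechanism: the class name SolrLogLevels and its helper method are hypothetical. It relies only on the fact that Solr's classes live under the org.apache.solr package, so their loggers inherit the level set on that namespace.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not part of Solr.
public class SolrLogLevels {

    // Hold a strong reference: java.util.logging keeps loggers weakly,
    // so a configured logger could otherwise be garbage collected.
    private static final Logger SOLR = Logger.getLogger("org.apache.solr");

    /** Sets the level for the whole org.apache.solr logger namespace. */
    static Logger setSolrLevel(Level level) {
        SOLR.setLevel(level);
        return SOLR;
    }

    public static void main(String[] args) {
        setSolrLevel(Level.WARNING);

        // A logger named after a Solr class has no level of its own,
        // so it inherits WARNING from the org.apache.solr parent.
        Logger core = Logger.getLogger("org.apache.solr.core.SolrCore");
        System.out.println("INFO enabled:    " + core.isLoggable(Level.INFO));
        System.out.println("WARNING enabled: " + core.isLoggable(Level.WARNING));
    }
}
```

In a deployed server the same effect is more commonly achieved through a logging properties file or the Servlet container's logging settings; the programmatic form is shown here only because it is easy to run in isolation.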

HTTP server request access logs

The HTTP server request logs record the requests that come in and are defined by the Servlet container in which Solr is deployed. For example, the default configuration for managing the server logs in Jetty is defined in jetty.xml:

<Ref id="RequestLog">
<Set name="requestLog">
<New id="RequestLogImpl" class="org.mortbay.jetty.NCSARequestLog">
<Arg><SystemProperty name="jetty.logs" default="./logs...
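The NCSARequestLog class configured above writes one line per request in the NCSA common log format (host, identity, user, timestamp, request line, status, bytes). As a rough sketch of how such a line could be picked apart when investigating traffic, the following parses a single entry with a regular expression. The class NcsaLogLine and the sample log line are made up for illustration; real lines from your container may carry extra trailing fields, which this pattern ignores.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for illustration; not part of Jetty or Solr.
public class NcsaLogLine {

    // NCSA common log format: host ident authuser [date] "request" status bytes
    private static final Pattern COMMON = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)");

    final String host;
    final String timestamp;
    final String request;
    final int status;

    private NcsaLogLine(String host, String timestamp, String request, int status) {
        this.host = host;
        this.timestamp = timestamp;
        this.request = request;
        this.status = status;
    }

    /** Parses one log line, or throws if it does not match the common format. */
    static NcsaLogLine parse(String line) {
        Matcher m = COMMON.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not an NCSA common log line: " + line);
        }
        return new NcsaLogLine(m.group(1), m.group(4), m.group(5),
                               Integer.parseInt(m.group(6)));
    }

    public static void main(String[] args) {
        // Invented sample entry, shaped like a Solr search request.
        String sample = "127.0.0.1 - - [21/Jun/2009:14:02:11 +0000] "
                      + "\"GET /solr/select?q=solr HTTP/1.1\" 200 1432";
        NcsaLogLine entry = parse(sample);
        System.out.println(entry.host + " " + entry.status + " " + entry.request);
    }
}
```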
