Book Image

Solr 1.4 Enterprise Search Server

By : David Smiley, Eric Pugh
Book Image

Solr 1.4 Enterprise Search Server

By: David Smiley, Eric Pugh

Overview of this book

<p>If you are a developer building a high-traffic web site, you need to have a terrific search engine. Sites like Netflix.com and Zappos.com employ Solr, an open source enterprise search server, which uses and extends the Lucene search library. This is the first book in the market on Solr and it will show you how to optimize your web site for high volume web traffic with full-text search capabilities along with loads of customization options. So, let your users gain a terrific search experience.<br /><br />This book is a comprehensive reference guide for every feature Solr has to offer. It serves the reader right from initiation to development to deployment. It also comes with complete running examples to demonstrate its use and show how to integrate it with other languages and frameworks.<br /><br />This book first gives you a quick overview of Solr, and then gradually takes you from basic to advanced features that enhance your search. It starts off by discussing Solr and helping you understand how it fits into your architecture—where all databases and document/web crawlers fall short, and Solr shines. The main part of the book is a thorough exploration of nearly every feature that Solr offers. To keep this interesting and realistic, we use a large open source set of metadata about artists, releases, and tracks courtesy of the MusicBrainz.org project. Using this data as a testing ground for Solr, you will learn how to import this data in various ways from CSV to XML to database access. You will then learn how to search this data in a myriad of ways, including Solr's rich query syntax, "boosting" match scores based on record data and other means, about searching across multiple fields with different boosts, getting facets on the results, auto-complete user queries, spell-correcting searches, highlighting queried text in search results, and so on.<br /><br />After this thorough tour, we'll demonstrate working examples of integrating a variety of technologies with Solr such as Java, JavaScript, Drupal, Ruby, XSLT, PHP, and Python.<br /><br />Finally, we'll cover various deployment considerations to include indexing strategies and performance-oriented configuration that will enable you to scale Solr to meet the needs of a high-volume site.</p>
Table of Contents (15 chapters)
Solr 1.4 Enterprise Search Server
Credits
About the Authors
About the Reviewers
Preface
Index

Using curl to interact with Solr


Solr receives commands (and possibly the associated data) through HTTP POST.

Note

Solr lets you use HTTP GET too (for example, through your web browser). However, this is an inappropriate HTTP verb if it causes something to change on the server, as happens with indexing. For more information on this concept, read about REST at http://en.wikipedia.org/wiki/Representational_State_Transfer

One way to send an HTTP POST is through the Unix command line program curl (also available on Windows through Cygwin). Even if you don't use curl, it is very important to know how we're going to use it, because the concepts will be applied no matter how you make the HTTP messages.

There are several ways to tell Solr to index data, and all of them are through HTTP POST:

  • Send the data as the entire POST payload (only applicable to Solr's XML format). curl does this with data-binary (or some similar options) and an appropriate content-type header reflecting that it's XML.

  • Send some...