Apache Solr 4 Cookbook

By: Rafał Kuć

Overview of this book

Apache Solr is a blazing fast, scalable, open source enterprise search server built upon Apache Lucene. Solr is wildly popular because it supports complex search criteria, faceting, result highlighting, query completion, query spellchecking, and relevancy tuning, among numerous other features.

"Apache Solr 4 Cookbook" will show you how to get the most out of your search engine. Full of practical recipes and examples, this book will show you how to set up Apache Solr, tune and benchmark performance, and index and analyze your data to provide better, more precise, and more useful search results.

"Apache Solr 4 Cookbook" will make your search better, more accurate, and faster, with practical recipes on essential topics such as SolrCloud, querying data, search faceting, text and data analysis, and cache configuration.

With numerous practical chapters centered on important Solr techniques and methods, "Apache Solr 4 Cookbook" is an essential resource for developers who wish to take their knowledge and skills further. Thoroughly updated and improved, this Cookbook also covers the changes in Apache Solr 4, including the awesome capabilities of SolrCloud.

Ignoring typos in terms of performance


Sometimes you would like to return relevant search results even when the user makes a typo, or even multiple typos. In Solr, there are multiple ways to handle this: using the spellchecker component to try and correct the user's mistake, using a fuzzy query, or using the ngram approach. This recipe concentrates on the third approach and shows you how to use ngrams to handle user typos.
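To see why this works, note that most of the character ngrams of a term survive a single typo, so an ngram-analyzed field still shares many index tokens with a misspelled query term. A minimal illustration of this overlap (plain Python, not Solr code; the sample term and typo are made up for the example):

```python
def ngrams(term, n=2):
    """Return the set of overlapping character n-grams of a term."""
    return {term[i:i + n] for i in range(len(term) - n + 1)}

correct = ngrams("lucene")  # {'lu', 'uc', 'ce', 'en', 'ne'}
typo = ngrams("lucen")      # the user dropped the final 'e'

# Four of the five bigrams survive the typo, so an ngram-analyzed
# field can still match the misspelled term.
shared = correct & typo
print(sorted(shared))
```

The fuzzy query and spellchecker approaches correct the query at search time; the ngram approach instead makes the index itself tolerant of partial matches, at the cost of a larger index.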

How to do it...

For the purpose of this recipe, let's assume that our index consists of four fields: identifier, name, description, and description_ngram, the last of which will be processed with the ngram filter.

  1. Let's start with the field definitions for our index, which should look like the following code (place this in the fields section of your schema.xml file):

    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="name" type="text_general" indexed="true" stored="true" />
    <field name="description" type="text_general" indexed="true" stored="true" />
    <!-- copy of the description analyzed with the ngram filter; its field
         type (here assumed to be named text_ngram) must be defined in the
         types section of schema.xml -->
    <field name="description_ngram" type="text_ngram" indexed="true" stored="true" />
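The description_ngram field needs a field type whose index-time analyzer applies the ngram filter. A minimal sketch of such a type (the name text_ngram and the gram sizes of 2 to 3 are illustrative assumptions, not the book's exact values; place it in the types section of schema.xml):

```xml
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- split each token into overlapping 2- to 3-character grams at index time -->
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="3"/>
  </analyzer>
  <analyzer type="query">
    <!-- no ngram filter at query time: the user's (possibly misspelled)
         term is matched against the grams stored in the index -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Applying the ngram filter only at index time keeps queries fast while still letting a misspelled query term match the grams produced from the correctly spelled indexed term.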