Book Image

Apache Solr 4 Cookbook

By : Rafał Kuć
Book Image

Apache Solr 4 Cookbook

By: Rafał Kuć

Overview of this book

<p>Apache Solr is a blazing fast, scalable, open source Enterprise search server built upon Apache Lucene. Solr is wildly popular because it supports complex search criteria, faceting, result highlighting, query-completion, query spell-checking, and relevancy tuning, amongst other numerous features.<br /><br />"Apache Solr 4 Cookbook" will show you how to get the most out of your search engine. Full of practical recipes and examples, this book will show you how to set up Apache Solr, tune and benchmark performance as well as index and analyze your data to provide better, more precise, and useful search data.<br /><br />"Apache Solr 4 Cookbook" will make your search better, more accurate and faster with practical recipes on essential topics such as SolrCloud, querying data, search faceting, text and data analysis, and cache configuration.<br /><br />With numerous practical chapters centered on important Solr techniques and methods, Apache Solr 4 Cookbook is an essential resource for developers who wish to take their knowledge and skills further. Thoroughly updated and improved, this Cookbook also covers the changes in Apache Solr 4 including the awesome capabilities of SolrCloud.</p>
Table of Contents (18 chapters)
Apache Solr 4 Cookbook
Credits
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

How to use Data Import Handler with the URL data source


Do you remember the first example with the Wikipedia data? We asked our fellow DB expert to import the data dump into PostgreSQL and we fetched the data from there. But what if our colleague is sick and can't help us, and we need to import that data? We can parse the data and send it to Solr, but that's not an option – we don't have much time to do that. So what to do? Yes, you guessed – we can use Data Import Handler and one of its data sources, file data source. This task will show you how to do that.

Getting ready

Please refer to the How to properly configure Data Import Handler recipe in this chapter to get to know the basics of the Data Import Handler configuration. I assume that Solr is set up according to the description given in the mentioned recipe.

How to do it...

Let's take a look at our data source. To be consistent, I chose to index the Wikipedia data, which you should already be familiar with.

  1. First of all, the index structure...