There are many real-life situations when you have to clean your data. Let's assume that you want to index web pages that your client sends you. You don't know anything about the structure of that page; one thing you know is that you must provide a search mechanism that will enable searching through the content of the pages. Of course you could index the whole page, splitting it by whitespaces, but then you would probably hear the clients complain about the HTML tags being searchable and so on. So before we enable searching the contents of the page, we need to clean the data. In this example we need to remove the HTML tags. This recipe will show you how to do it with Solr.
Apache Solr 4 Cookbook
By :
Apache Solr 4 Cookbook
By:
Overview of this book
<p>Apache Solr is a blazing fast, scalable, open source Enterprise search server built upon Apache Lucene. Solr is wildly popular because it supports complex search criteria, faceting, result highlighting, query-completion, query spell-checking, and relevancy tuning, amongst other numerous features.<br /><br />"Apache Solr 4 Cookbook" will show you how to get the most out of your search engine. Full of practical recipes and examples, this book will show you how to set up Apache Solr, tune and benchmark performance as well as index and analyze your data to provide better, more precise, and useful search data.<br /><br />"Apache Solr 4 Cookbook" will make your search better, more accurate and faster with practical recipes on essential topics such as SolrCloud, querying data, search faceting, text and data analysis, and cache configuration.<br /><br />With numerous practical chapters centered on important Solr techniques and methods, Apache Solr 4 Cookbook is an essential resource for developers who wish to take their knowledge and skills further. Thoroughly updated and improved, this Cookbook also covers the changes in Apache Solr 4 including the awesome capabilities of SolrCloud.</p>
Table of Contents (18 chapters)
Apache Solr 4 Cookbook
Credits
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Free Chapter
Apache Solr Configuration
Indexing Your Data
Analyzing Your Text Data
Querying Solr
Using the Faceting Mechanism
Improving Solr Performance
In the Cloud
Using Additional Solr Functionalities
Dealing with Problems
Real-life Situations
Index
Customer Reviews