Book Image

Apache Solr 4 Cookbook

By : Rafał Kuć
Book Image

Apache Solr 4 Cookbook

By: Rafał Kuć

Overview of this book

<p>Apache Solr is a blazing fast, scalable, open source Enterprise search server built upon Apache Lucene. Solr is wildly popular because it supports complex search criteria, faceting, result highlighting, query-completion, query spell-checking, and relevancy tuning, amongst other numerous features.<br /><br />"Apache Solr 4 Cookbook" will show you how to get the most out of your search engine. Full of practical recipes and examples, this book will show you how to set up Apache Solr, tune and benchmark performance as well as index and analyze your data to provide better, more precise, and useful search data.<br /><br />"Apache Solr 4 Cookbook" will make your search better, more accurate and faster with practical recipes on essential topics such as SolrCloud, querying data, search faceting, text and data analysis, and cache configuration.<br /><br />With numerous practical chapters centered on important Solr techniques and methods, Apache Solr 4 Cookbook is an essential resource for developers who wish to take their knowledge and skills further. Thoroughly updated and improved, this Cookbook also covers the changes in Apache Solr 4 including the awesome capabilities of SolrCloud.</p>
Table of Contents (18 chapters)
Apache Solr 4 Cookbook
Credits
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

What to do when Solr slows down during indexing


One of the most common problems when indexing a vast amount of data is the indexing time. Some of the problems with indexing time are not easily resolvable, but others are. Imagine that you need to index about 300,000 documents that are in a single XML file. You run the post.sh bash script that is provided with Solr and you wait, wait, and wait. Something is wrong – when you index 10,000 documents you need about a minute, but now you are waiting about an hour and the commit operation didn't take place. Is there something we can do to speed it up? Sure, and this recipe will tell you how to.

How to do it...

The solution to the situation is very simple – just add the commit operation every now and then. But as you may have noticed, I mentioned that our data is written in a single XML file. So, how do we add the commit operation to that kind of data? Send it in parallel to the indexing process? No, we need to enable the auto commit mechanism. To...