Speeding up indexing with Solr segment merge tuning
During indexing, Solr (actually Lucene) creates a series of new index files called segments. Each segment is written once and read many times, which means that once it is written, it cannot be changed (with a few exceptions, such as the markers for deleted documents and numeric doc values). After some time, Solr will try to merge multiple small segments into bigger ones. This is because a query has to consult every segment, so the more segments the index is built of, the slower the queries will be. Of course, we have the ability to force a segment merge (by running the force merge command), but such an operation is resource intensive, because Lucene will rewrite the index segments. Because of that, Solr allows us to tune the segment merge process to match our needs, and this recipe will show you how to do that.
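To make the force merge command mentioned above more concrete, here is a minimal Python sketch that only builds the request URL and does not send it. It assumes that the force merge is triggered through the update handler with the optimize and maxSegments parameters; the host, port, and the core name cookbook are hypothetical and only serve as an illustration, and force_merge_url is a helper name invented for this example.

```python
from urllib.parse import urlencode


def force_merge_url(base_url: str, core: str, max_segments: int = 1) -> str:
    """Build the update handler URL that asks Solr to force merge a core.

    base_url and core are placeholders for your own installation.
    """
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    # optimize=true requests the force merge, maxSegments limits
    # how many segments may remain after the merge finishes.
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url.rstrip('/')}/{core}/update?{params}"


# Hypothetical local installation with a core named "cookbook".
print(force_merge_url("http://localhost:8983/solr", "cookbook", max_segments=1))
```

The resulting URL can be requested with any HTTP client. Keep in mind that the request can take a long time on a large index, which is one more reason to prefer tuning the merge process over forcing merges by hand.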
How to do it...
The merge policy is what controls how merges are done in Apache Lucene and thus in Solr. By default, the merge policy is not explicitly defined in Solr and...