Mastering Elasticsearch - Second Edition

Altering Apache Lucene scoring


With the release of Apache Lucene 4.0 in 2012, users of this full-text search library were given the opportunity to replace the default TF/IDF-based scoring algorithm. The Lucene API was changed to make the scoring formula easier to modify and extend. However, this was not the only change to how Lucene calculates document scores. Lucene 4.0 also shipped with additional similarity models, which allow us to use a different scoring formula for our documents. In this section, we will take a deeper look at what Lucene 4.0 brought and how these features were incorporated into Elasticsearch.

Available similarity models

As already mentioned, the original and default similarity model available before Apache Lucene 4.0 was the TF/IDF model. We already discussed it in detail in the Default Apache Lucene scoring explained section in Chapter 2, Power User Query DSL.
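As a quick reminder of what the default model computes, the following is a minimal Python sketch of the term frequency and inverse document frequency factors as they are commonly documented for Lucene's classic TF/IDF similarity. The function names are hypothetical, and the combined weight deliberately ignores boosts, field norms, the coordination factor, and query normalization, so treat it as an illustration rather than Lucene's actual implementation.

```python
import math


def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count in the field."""
    return math.sqrt(freq)


def idf(doc_freq: int, num_docs: int) -> float:
    """Inverse document frequency factor: rarer terms get a higher value."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def simplified_term_weight(freq: float, doc_freq: int, num_docs: int) -> float:
    """Hypothetical helper: tf * idf^2, with every other scoring factor left out.

    idf is squared because it contributes to both the query-side and the
    document-side weight in the classic formula.
    """
    return tf(freq) * idf(doc_freq, num_docs) ** 2


if __name__ == "__main__":
    # A term that occurs 4 times in a field and appears in 9 of 10 documents.
    print(simplified_term_weight(freq=4, doc_freq=9, num_docs=10))
    # The same term count, but the term appears in only 1 of 10 documents.
    print(simplified_term_weight(freq=4, doc_freq=1, num_docs=10))
```

The second call yields a larger value than the first, which reflects the core idea of the model: a match on a rare term counts for more than a match on a common one.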

The five new similarity models that we can use are:

  • Okapi...