The information-based model in Lucene consists of three components: Distribution, Lambda, and Normalization. The setup is similar to DFRSimilarity,
in that you instantiate these components and pass them to the constructor. The Similarity class for this model is IBSimilarity. Here is an excerpt from Lucene's Javadoc on the components:
Distribution: This is the probabilistic distribution used to model term occurrence:
DistributionLL
: This is the Log-logistic distribution
DistributionSPL
: This is the Smoothed power-law distribution
Lambda: This is the λw parameter of the probability distribution:
LambdaDF
: This is Nw/N, or the average number of documents where w occurs
LambdaTTF
: This is Fw/N, or the average number of occurrences of w in the collection
Normalization: This is term frequency normalization:
NormalizationH1
: In this, there is a uniform distribution of term frequency
NormalizationH2
: In this, term frequency density is inversely...
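
To make the two Lambda options listed above concrete, here is a minimal sketch in plain Java that computes Nw/N and Fw/N from collection statistics. The class LambdaSketch, its method names, and the sample numbers are hypothetical and are not part of Lucene; Lucene's own LambdaDF and LambdaTTF classes may apply additional smoothing that this sketch leaves out.

```java
// Hypothetical helper, NOT part of Lucene: illustrates the two lambda estimates only.
public class LambdaSketch {

    // Nw / N: number of documents containing term w, divided by the number of documents.
    public static double lambdaDF(long docFreq, long numDocs) {
        return (double) docFreq / numDocs;
    }

    // Fw / N: total occurrences of term w in the collection, divided by the number of documents.
    public static double lambdaTTF(long totalTermFreq, long numDocs) {
        return (double) totalTermFreq / numDocs;
    }

    public static void main(String[] args) {
        // Made-up collection statistics, for illustration only.
        long numDocs = 1000;        // N
        long docFreq = 50;          // Nw
        long totalTermFreq = 120;   // Fw

        System.out.println("LambdaDF  = " + lambdaDF(docFreq, numDocs));
        System.out.println("LambdaTTF = " + lambdaTTF(totalTermFreq, numDocs));
    }
}
```

In Lucene itself you would not compute these values by hand. You pick one implementation of each component and pass the three objects to the IBSimilarity constructor, for example new IBSimilarity(new DistributionLL(), new LambdaDF(), new NormalizationH1()).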