Now that we've seen numerous examples on Lucene's built-in Filters, we are ready for a more advanced topic, custom filters. There are a few important components we need to go over before we start: FieldCache
, SortedDocValues
, and DocIdSet
. We will be using these items in our example to help you gain practical knowledge on the subject.
In the FieldCache
, as you already learned, is a cache that stores field values in memory in an array structure. It's a very simple data structure as the slots in the array basically correspond to DocIds. This is also the reason why FieldCache
only works for a single-valued field. A slot in an array can only hold a single value. Since this is just an array, the lookup time is constant and very fast.
The SortedDocValues
has two internal data mappings for values' lookup: a dictionary mapping an ordinal value to a field value and a DocId to an ordinal value (for the field value) mapping. In the dictionary data structure, the values are deduplicated...