Elasticsearch is a search engine built on top of Lucene; it hands data to Lucene for storage and searching. Lucene's data structures perform better when the data is stored densely: for example, documents that all share the same fields can be stored densely, whereas mixing documents with different sets of fields together leaves the storage sparse. Lucene identifies each document by a doc_id, an integer that ranges from 0 up to the total number of documents in the index. This is how Lucene recognizes an Elasticsearch document in the index, and these doc_id values are what Lucene's internal APIs use to communicate with one another.
For example, if we execute a match query on a term, Lucene produces an iterator of doc_ids. These doc_ids are then used to retrieve the value of the norm, which is needed to compute each document's score in the search. In the current norm lookup implementation, one byte is reserved for each document to store the norm value...
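The lookup described above can be pictured with a small toy model. The sketch below is not Lucene's code: the names docs, norms, encode_norm, match, score, and search are all hypothetical, the norm is simply the field length capped at one byte, and the scoring formula is a simplified term-frequency ratio rather than Lucene's real similarity. It only illustrates the flow of a query producing an iterator of doc_ids, each of which is then used as an index into a one-byte-per-document norm array.

```python
from typing import Iterator, List, Tuple

# Hypothetical toy index: a document's doc_id is just its position (0..N-1).
docs: List[str] = [
    "quick brown fox",
    "lazy dog",
    "quick quick fox jumps over the lazy dog",
]


def encode_norm(text: str) -> int:
    """Squeeze a field-length statistic into one byte (0-255).

    Assumption for illustration only; this is not Lucene's norm encoding.
    """
    return min(len(text.split()), 255)


# One byte reserved per document, addressed directly by doc_id.
norms = bytearray(encode_norm(text) for text in docs)


def match(term: str) -> Iterator[int]:
    """Yield the doc_ids of documents containing the term, in increasing order.

    A real engine would read these from an inverted index; this toy scans.
    """
    for doc_id, text in enumerate(docs):
        if term in text.split():
            yield doc_id


def score(term: str, doc_id: int) -> float:
    """Simplified score: term frequency divided by the stored norm."""
    term_freq = docs[doc_id].split().count(term)
    return term_freq / norms[doc_id]  # norm lookup = read the byte at doc_id


def search(term: str) -> List[Tuple[int, float]]:
    """Score every matching doc_id and return hits, best score first."""
    hits = [(doc_id, score(term, doc_id)) for doc_id in match(term)]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    for doc_id, value in search("quick"):
        print(doc_id, round(value, 3))
```

Because the norm array has exactly one slot per doc_id, the lookup is a single indexed read; the trade-off in this toy model is that a byte is reserved for every document, whether or not that document is ever matched.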