Learning Elasticsearch

By: Abhishek Andhavarapu

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. You can use Elasticsearch for small or large applications with billions of documents. It is built to scale horizontally and can handle both structured and unstructured data. Packed with easy-to-follow examples, this book gives you a firm understanding of the basics of Elasticsearch and shows you how to use its capabilities efficiently. You will install and set up Elasticsearch and Kibana, and handle documents using the Distributed Document Store. You will see how to query, search, and index your data, and how to perform aggregation-based analytics. You will see how to use Kibana to explore and visualize your data. Further on, you will learn to handle document relationships, work with geospatial data, and much more. Finally, you will see how to set up and scale your Elasticsearch clusters in production environments.
Table of Contents (11 chapters)
10. Exploring Elastic Stack (Elastic Cloud, Security, Graph, and Alerting)

Doc values

Before we jump into doc values, let's quickly review what an inverted index is and why it is needed. Let's say we have the following documents:

  • Doc 1: Apple
  • Doc 2: Apple
  • Doc 3: Samsung

The inverted index for the preceding documents looks like the following:

Term       Doc IDs
Apple      1, 2
Samsung    3
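
As a rough illustration of how such a table comes about, the following Python sketch builds the same term-to-document mapping from the three sample documents. It is a simplified in-memory model, not how Elasticsearch or Lucene stores its index; the names `documents` and `build_inverted_index` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical sample data mirroring the three documents above:
# document ID -> value of the manufacturer field.
documents = {1: "Apple", 2: "Apple", 3: "Samsung"}


def build_inverted_index(docs):
    """Map each term to the list of IDs of the documents that contain it."""
    index = defaultdict(list)
    # Visit documents in ID order so each posting list stays sorted.
    for doc_id in sorted(docs):
        term = docs[doc_id]
        index[term].append(doc_id)
    return dict(index)


inverted_index = build_inverted_index(documents)
print(inverted_index)  # {'Apple': [1, 2], 'Samsung': [3]}
```

Each field value is treated here as a single term. A real index would first run the text through an analyzer, which this sketch leaves out.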

To find all the products manufactured by Apple, we would simply use a match query as shown here:

{
  "query": {
    "match": {
      "manufacturer": "Apple"
    }
  }
}
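
To picture why this kind of query is cheap, here is a minimal Python sketch that answers the same question against a hand-written, in-memory inverted index. It assumes one term per document and ignores text analysis and relevance scoring, both of which a real match query performs; `inverted_index` and `match_term` are hypothetical names used only for illustration.

```python
# Hypothetical in-memory inverted index for the manufacturer field:
# term -> sorted list of document IDs.
inverted_index = {"Apple": [1, 2], "Samsung": [3]}


def match_term(index, term):
    """Return the IDs of documents containing the term (one dictionary lookup)."""
    # An unknown term simply matches no documents.
    return index.get(term, [])


apple_docs = match_term(inverted_index, "Apple")
print(apple_docs)  # [1, 2]
```

The lookup cost does not depend on how many other terms the index holds, which is the property that makes term-based search fast.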

With the help of the inverted index, we can quickly look up all the documents associated with the term Apple. Sorting and aggregations need the opposite lookup: given a document, find the terms it contains. With only the inverted index, we would have to go through the entire terms list and collect the document IDs, which is impractical for a large index. To solve this problem, doc values were introduced. Doc values for the preceding...