Book Image

Elasticsearch Indexing

By : Huseyin Akdogan
Book Image

Elasticsearch Indexing

By: Huseyin Akdogan

Overview of this book

Beginning with an overview of the way ElasticSearch stores data, you’ll begin to extend your knowledge to tackle indexing and mapping, and learn how to configure ElasticSearch to meet your users’ needs. You’ll then find out how to use analysis and analyzers for greater intelligence in how you organize and pull up search results – to guarantee that every search query is met with the relevant results! You’ll explore the anatomy of an ElasticSearch cluster, and learn how to set up configurations that give you optimum availability as well as scalability. Once you’ve learned how these elements work, you’ll find real-world solutions to help you improve indexing performance, as well as tips and guidance on safety so you can back up and restore data. Once you’ve learned each component outlined throughout, you will be confident that you can help to deliver an improved search experience – exactly what modern users demand and expect.
Table of Contents (15 chapters)
Elasticsearch Indexing
About the Author
About the Reviewer

Process of analysis

We mentioned in Chapter 1, Introduction to Efficient Indexing and Chapter 2, What is an Elasticsearch Index that all Apache Lucene's data is stored in the inverted index. This means that the data is being transformed. The process of transforming data is called analysis. The analysis process relies on two basic pillars: tokenizing and normalizing.

The first step of the analysis process is to break the text into tokens using tokenizer after processing by the character filters for the inverted index. Then, it normalizes these tokens (that is, terms) to make them easily searchable.

Inverted index processes are performed by analyzers. Generally, an analyzer is composed of a tokenizer and one or more token filters. During the indexing time, when Elasticsearch processes a field that must be indexed, it checks whether an analyzer is defined at several levels or not because an analyzer can be specified at several levels.

The check order is as follows:

  1. At field level

  2. At type level

  3. At...