Elasticsearch 8.x Cookbook - Fifth Edition

By: Alberto Paro

Overview of this book

Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content at petabyte scale. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index mapping, aggregation, working with queries, and scripting using Elasticsearch. You'll focus on numerous solutions and quick techniques for performing both common and uncommon tasks such as deploying Elasticsearch nodes, using the ingest module, working with X-Pack, and creating different visualizations. As you advance, you'll learn how to manage various clusters, restore data, and install Kibana to monitor a cluster and extend it using a variety of plugins. Furthermore, you'll understand how to integrate your Java, Scala, and Python applications, as well as big data tools such as Apache Spark and Pig, with Elasticsearch and create efficient data applications powered by enhanced functionalities and custom plugins. By the end of this Elasticsearch cookbook, you'll have gained in-depth knowledge of implementing the Elasticsearch architecture and be able to manage, search, and store data efficiently and effectively using Elasticsearch.
Table of Contents (20 chapters)

Chapter 12: Using the Ingest Module

Elasticsearch 8.x provides a set of powerful functionalities, delivered through the ingest node, that target the problems that arise during the ingestion of documents.

In Chapter 1, Getting Started, we discussed that an Elasticsearch node can have different roles, the most important of which are master, data, and ingest. The idea behind splitting the ingest component from the others is to create a more stable cluster, because problems can arise when preprocessing documents. These are mainly due to custom plugins used in the ingest part, which could require restarting the ingest nodes in order to be updated.

To create a more stable cluster, the ingest nodes should be isolated from the master nodes (and possibly also from the data nodes) in case problems occur, such as a crash caused by a plugin (for example, the attachment plugin) or high load caused by complex type manipulation.
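
As a rough illustration of this isolation, the following minimal sketch renders the node.roles line that each dedicated node type would carry in its elasticsearch.yml. The node.roles setting and the master, data, and ingest role names are standard Elasticsearch settings. The helper function and the mapping are hypothetical names introduced only for this example.

```python
# Hypothetical helper (not part of Elasticsearch): renders the `node.roles`
# line that a dedicated node would carry in its elasticsearch.yml.

# One role per node type, so that ingest work never runs on master or data nodes.
DEDICATED_ROLES = {
    "master": ["master"],
    "data": ["data"],
    "ingest": ["ingest"],
}


def node_roles_setting(kind: str) -> str:
    """Return the elasticsearch.yml line for a dedicated node of the given kind."""
    roles = DEDICATED_ROLES[kind]
    return "node.roles: [ " + ", ".join(roles) + " ]"


if __name__ == "__main__":
    for kind in DEDICATED_ROLES:
        print(f"# dedicated {kind} node")
        print(node_roles_setting(kind))
```

With this layout, a crash or restart of an ingest node does not take down a node that also holds the master or data role.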

The ingest node can replace a Logstash installation in simple scenarios.
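
To give an idea of what such a simple scenario might look like, the following sketch builds the body of an ingest pipeline and prints the REST request that would register it. The set, rename, and lowercase processors and the _ingest/pipeline endpoint are part of Elasticsearch. The pipeline ID, the field names, and the helper functions are assumptions made up for this example, and nothing is sent to a cluster.

```python
import json


def build_pipeline(description, processors):
    """Assemble the JSON body of an ingest pipeline definition."""
    return {"description": description, "processors": processors}


# Field names (environment, msg, message, level) are hypothetical.
pipeline = build_pipeline(
    "Tag and normalize incoming log documents",
    [
        # Add a constant field to every document.
        {"set": {"field": "environment", "value": "production"}},
        # Rename a field, skipping documents that do not have it.
        {"rename": {"field": "msg", "target_field": "message",
                    "ignore_missing": True}},
        # Lowercase the log level, skipping documents that do not have it.
        {"lowercase": {"field": "level", "ignore_missing": True}},
    ],
)


def put_pipeline_request(pipeline_id, body):
    """Return (method, path, payload) for registering the pipeline."""
    return "PUT", f"/_ingest/pipeline/{pipeline_id}", json.dumps(body, indent=2)


if __name__ == "__main__":
    method, path, payload = put_pipeline_request("simple-logs", pipeline)
    print(method, path)
    print(payload)
```

Once a pipeline like this is registered, documents indexed with the pipeline=simple-logs query parameter pass through its processors before they are stored.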

In this chapter, we...