Mastering Elasticsearch 5.x - Third Edition

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. It builds on the capabilities of Apache Lucene and provides a new level of control over how you index and search even huge sets of data. This book gives you a brief recap of the basics and introduces the new features of Elasticsearch 5. We will guide you through the intermediate and advanced functionality of Elasticsearch, such as querying, indexing, searching, and modifying data. We'll also explore advanced concepts, including aggregation, index control, sharding, replication, and clustering. We'll show you the monitoring and administration modules available in Elasticsearch, and will also cover backup and recovery. You will learn how to scale your Elasticsearch cluster to fit your use case and improve its performance. We'll also show you how to create your own analysis plugin for Elasticsearch. By the end of the book, you will have the knowledge you need to master Elasticsearch and put it to efficient use.
Table of Contents (13 chapters)

Summary


In this chapter, we discussed ingest nodes, which let us preprocess and enrich data within the Elasticsearch cluster itself before the actual indexing takes place. We also covered the concept of federated search in Elasticsearch and how it can be achieved with the help of tribe nodes.

The next chapter is dedicated to improving Elasticsearch performance under different loads and to scaling production clusters the right way. It also offers insights into garbage collection and hot thread issues and how to deal with them. We will talk about query profiling and query benchmarking, which show which part of a query takes the most time to execute. Finally, we will give general cluster tuning advice for high query rate scenarios versus high indexing throughput scenarios.