Mastering Elasticsearch 5.x - Third Edition

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. It leverages the capabilities of Apache Lucene and provides a new level of control over how you can index and search even huge sets of data. This book gives you a brief recap of the basics and introduces you to the new features of Elasticsearch 5. We will guide you through the intermediate and advanced functionalities of Elasticsearch, such as querying, indexing, searching, and modifying data. We'll also explore advanced concepts, including aggregation, index control, sharding, replication, and clustering. We'll show you the monitoring and administration modules available in Elasticsearch, and will also cover backup and recovery. You will learn how to scale your Elasticsearch cluster, contextualize it, and improve its performance. We'll also show you how to create your own analysis plugin in Elasticsearch. By the end of the book, you will have all the knowledge necessary to master Elasticsearch and put it to efficient use.

Understanding Elasticsearch caching


Caching is one of the most important parts of Elasticsearch, although it is not always visible to users. It allows Elasticsearch to keep commonly used data in memory and reuse it on demand. Of course, we can't cache everything: we usually have far more data than memory, and building caches can be expensive in terms of performance. In the Instant aggregations in Elasticsearch 5.0 section of Chapter 5, Improving the User Search Experience, we discussed some major improvements made to query parsing and caching. In this chapter, we will look at the different caches exposed by Elasticsearch, and we will discuss how they are used and how we can control their usage. This information should help you better understand how this search server works internally.

Node query cache

The query cache is responsible for caching the results of queries. There is one query cache per node, and it is shared by all shards on that node...
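
To make the idea of a single cache shared by all shards on a node more concrete, here is a minimal Python sketch. It is a toy model, not Elasticsearch's actual implementation: the class name `NodeQueryCache`, the `get_or_compute` method, the `(shard_id, query)` key, and sizing the cache by number of entries are all assumptions made for illustration, as is the least-recently-used eviction policy shown here.

```python
from collections import OrderedDict


class NodeQueryCache:
    """Toy model: one cache per node, shared by every shard on that node."""

    def __init__(self, max_entries):
        # Hypothetical sizing by entry count; chosen only to keep the sketch short.
        self.max_entries = max_entries
        self._entries = OrderedDict()  # (shard_id, query) -> cached result
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, shard_id, query, compute):
        key = (shard_id, query)
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        result = compute()
        self._entries[key] = result
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result


cache = NodeQueryCache(max_entries=2)
cache.get_or_compute(0, "status:active", lambda: {1, 4})  # miss on shard 0
cache.get_or_compute(1, "status:active", lambda: {7})     # miss on shard 1
cache.get_or_compute(0, "status:active", lambda: {1, 4})  # hit on shard 0
cache.get_or_compute(1, "type:book", lambda: {9})         # miss; cache is full, so one entry is evicted
```

Because both shards draw on the same limited budget, the last call pushes out the entry cached for shard 1, even though the new entry also belongs to shard 1. In this model, heavy query traffic on one shard can evict entries cached for another shard on the same node.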