Monitoring Elasticsearch

By: Dan Noble, Pulkit Agrawal, Mahmoud Lababidi

Overview of this book

Elasticsearch is a distributed search server similar to Apache Solr, with a focus on large datasets, a schema-less setup, and high availability. Its schema-free architecture allows Elasticsearch to index and search unstructured content, making it well suited to both small projects and large data warehouses holding petabytes of unstructured data. This book is a toolkit for keeping your cluster in good health and for diagnosing and treating unexpected issues along the way. You will start with an introduction to Elasticsearch and a look at some common performance issues that arise when using the system. You will then see how to install and configure Elasticsearch and the Elasticsearch monitoring plugins, and how to install and use the Marvel dashboard to monitor Elasticsearch. You will find out how to troubleshoot some of the common performance and reliability issues that come up when using Elasticsearch. Finally, you will analyze your cluster's historical performance and learn how to get to the bottom of, and recover from, system failures. This book guides you through several monitoring tools and uses real-world cases and dilemmas faced when using Elasticsearch, showing you how to solve them simply, quickly, and cleanly.

The fielddata cache


A poorly configured Elasticsearch fielddata cache is often the reason for OutOfMemoryError exceptions.

When running a sort or aggregation (or facet) query, Elasticsearch fills the cache with all distinct values of the fields that the query sorts or aggregates on. This allows similar, subsequent queries to execute more quickly. However, Elasticsearch doesn't put an upper bound on the cache size by default, so the data is never evicted automatically. If the cache grows until the JVM heap (sized by ES_HEAP_SIZE) is exhausted, the node will throw an OutOfMemoryError exception and will require an Elasticsearch restart.
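
One way to see how close a node is to this situation is to compare its fielddata memory with its maximum heap. The following is a minimal sketch in Python, assuming a dictionary shaped like the response of the nodes stats API (GET /_nodes/stats), with the fields indices.fielddata.memory_size_in_bytes and jvm.mem.heap_max_in_bytes. The function name fielddata_heap_share and the sample values are hypothetical and only illustrate the calculation; they are not real cluster output.

```python
def fielddata_heap_share(node_stats):
    """Return {node name: fraction of the maximum heap held by fielddata}."""
    shares = {}
    for node in node_stats["nodes"].values():
        fielddata_bytes = node["indices"]["fielddata"]["memory_size_in_bytes"]
        heap_max_bytes = node["jvm"]["mem"]["heap_max_in_bytes"]
        shares[node["name"]] = fielddata_bytes / heap_max_bytes
    return shares


# Made-up sample in the assumed shape of a nodes stats response.
SAMPLE_STATS = {
    "nodes": {
        "node-id-1": {
            "name": "es-node-1",
            "indices": {"fielddata": {"memory_size_in_bytes": 3 * 1024**3}},
            "jvm": {"mem": {"heap_max_in_bytes": 8 * 1024**3}},
        },
        "node-id-2": {
            "name": "es-node-2",
            "indices": {"fielddata": {"memory_size_in_bytes": 1 * 1024**3}},
            "jvm": {"mem": {"heap_max_in_bytes": 8 * 1024**3}},
        },
    }
}

for name, share in sorted(fielddata_heap_share(SAMPLE_STATS).items()):
    print(f"{name}: fielddata uses {share:.1%} of the maximum heap")
```

A share that keeps climbing between checks suggests that the cache is growing without eviction, which is the pattern described above.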

To limit the fielddata cache size, set the indices.fielddata.cache.size value:

indices.fielddata.cache.size: 30%

This will limit the fielddata cache size to 30% of the available JVM heap space.

You can also set a fixed size. For example, setting it to 10gb will limit the cache size to no more than 10 gigabytes. The value that you choose will depend on the cluster and use...
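
To compare a percentage setting with a fixed one, it can help to translate both into bytes for a given heap size. The sketch below is a hypothetical helper, not part of Elasticsearch; it assumes that byte-size units are binary multiples (1kb = 1024 bytes) and handles only the simple forms shown above, such as 30% and 10gb.

```python
import re

# Binary multiples: an assumption matching Elasticsearch byte-size units.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def fielddata_limit_bytes(setting, heap_bytes):
    """Translate a cache-size setting such as '30%' or '10gb' into bytes."""
    value = setting.strip().lower()
    if value.endswith("%"):
        # Percentages are taken relative to the JVM heap size.
        percent = float(value[:-1])
        return int(heap_bytes * percent / 100)
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(b|kb|mb|gb|tb)", value)
    if match is None:
        raise ValueError(f"unrecognized size: {setting!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


heap = 10 * 1024**3  # hypothetical 10 GiB heap
for setting in ("30%", "10gb"):
    print(setting, "->", fielddata_limit_bytes(setting, heap), "bytes")
```

On this hypothetical 10 GiB heap, a fixed limit of 10gb would equal the entire heap and so would not leave room for anything else, whereas 30% leaves most of the heap free.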