Mastering Elasticsearch 5.x - Third Edition

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. Elasticsearch leverages the capabilities of Apache Lucene and provides a new level of control over how you can index and search even huge sets of data. This book will give you a brief recap of the basics and also introduce you to the new features of Elasticsearch 5. We will guide you through the intermediate and advanced functionalities of Elasticsearch, such as querying, indexing, searching, and modifying data. We’ll also explore advanced concepts, including aggregation, index control, sharding, replication, and clustering. We’ll show you the monitoring and administration modules available in Elasticsearch, and will also cover backup and recovery. You will get an understanding of how you can scale your Elasticsearch cluster to suit your context and improve its performance. We’ll also show you how you can create your own analysis plugin in Elasticsearch. By the end of the book, you will have all the knowledge necessary to master Elasticsearch and put it to efficient use.

Shard allocation control


One of the primary roles of the master node in an Elasticsearch cluster is to allocate shards to nodes and to move shards from one node to another to keep the cluster balanced. In this section, we will look at the settings available to control this allocation process.

Allocation awareness

Allocation awareness allows us to configure the allocation of shards and their replicas using generic parameters. This comes in very handy when you are running the cluster on multiple VMs on the same physical server, on multiple racks, or across multiple availability zones. In these scenarios, if more than one node on the same physical server, rack, or availability zone goes down at once, a shard could lose its primary and all of its replicas together. Shard allocation awareness helps ensure high availability by tagging instances so that primaries and replicas are spread across different zones or racks.
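
To make the idea concrete, here is a minimal Python sketch under stated assumptions. The attribute name rack_id, the node names, and the shard layout are all hypothetical and chosen only for illustration. The first helper builds a request body for the cluster update settings API using the cluster.routing.allocation.awareness.attributes setting; the second is a toy check, not anything Elasticsearch itself runs, that reports shards whose copies all sit in the same rack.

```python
import json


def awareness_settings(attributes, persistent=False):
    """Build a body for PUT /_cluster/settings that enables allocation awareness.

    `attributes` lists node attribute names (hypothetical here, e.g. "rack_id").
    In Elasticsearch 5.x each node would carry the tag in elasticsearch.yml,
    for example:  node.attr.rack_id: rack_one
    """
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            "cluster.routing.allocation.awareness.attributes": ",".join(attributes)
        }
    }


def same_location_shards(shard_copies, node_attrs, attribute):
    """Return ids of shards whose copies all share one value of `attribute`.

    shard_copies: shard id -> list of node names holding a copy
    node_attrs:   node name -> dict of attribute name -> value
    """
    risky = []
    for shard_id, nodes in sorted(shard_copies.items()):
        locations = {node_attrs[node][attribute] for node in nodes}
        if len(nodes) > 1 and len(locations) == 1:
            risky.append(shard_id)
    return risky


if __name__ == "__main__":
    # Request body we could send to the cluster update settings API.
    print(json.dumps(awareness_settings(["rack_id"]), indent=2))

    # Hypothetical four-node cluster spread over two racks.
    node_attrs = {
        "node1": {"rack_id": "rack_one"},
        "node2": {"rack_id": "rack_one"},
        "node3": {"rack_id": "rack_two"},
        "node4": {"rack_id": "rack_two"},
    }
    # Shard 0 spans both racks; shard 1 has both copies in rack_one.
    shard_copies = {0: ["node1", "node3"], 1: ["node1", "node2"]}
    print(same_location_shards(shard_copies, node_attrs, "rack_id"))
```

In this made-up layout, shard 1 is the one at risk: losing rack_one would take out both of its copies, which is exactly the situation allocation awareness is meant to avoid.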

In order to illustrate how allocation awareness works, we assume that we have a cluster built...