Mastering Elasticsearch 5.x - Third Edition

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. Elasticsearch leverages the capabilities of Apache Lucene, and provides a new level of control over how you can index and search even huge sets of data. This book will give you a brief recap of the basics and also introduce you to the new features of Elasticsearch 5. We will guide you through the intermediate and advanced functionalities of Elasticsearch, such as querying, indexing, searching, and modifying data. We’ll also explore advanced concepts, including aggregation, index control, sharding, replication, and clustering. We’ll show you the monitoring and administration modules available in Elasticsearch, and will also cover backup and recovery. You will get an understanding of how you can scale your Elasticsearch cluster to contextualize it and improve its performance. We’ll also show you how you can create your own analysis plugin in Elasticsearch. By the end of the book, you will have all the knowledge necessary to master Elasticsearch and put it to efficient use.

Striping data on multiple paths


Elasticsearch supported striping data across more than one path for a very long time, but since version 2.0.0 it no longer does. Instead of striping data across multiple paths, Elasticsearch now allocates different shards to different paths. Striping was removed because the files of a single segment in a shard could be spread across multiple disks, so the failure of a single disk could corrupt multiple shards or indices.
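To make the difference concrete, the following is a minimal Python sketch of the two placement strategies. It is a toy model and not Elasticsearch's actual allocation code: the function names, the file names, and the round-robin choice of path are all assumptions made for illustration only.

```python
from collections import defaultdict


def stripe_files(shard_files, paths):
    """Toy model of striping: files of every shard are spread
    round-robin over all data paths (hypothetical policy)."""
    placement = defaultdict(set)  # path -> shards with at least one file there
    i = 0
    for shard, files in shard_files.items():
        for _ in files:
            placement[paths[i % len(paths)]].add(shard)
            i += 1
    return placement


def allocate_shards(shard_files, paths):
    """Toy model of per-shard allocation: all files of a shard
    live on a single data path (round-robin is an assumption)."""
    placement = defaultdict(set)
    for i, shard in enumerate(shard_files):
        placement[paths[i % len(paths)]].add(shard)
    return placement


def shards_hit_by_failure(placement, failed_path):
    """Return the shards that lose at least one file when a path fails."""
    return sorted(placement.get(failed_path, set()))


# Hypothetical shards and segment file names, for illustration only.
shard_files = {
    "shard0": ["_0.cfs", "_0.si", "_1.cfs"],
    "shard1": ["_0.cfs", "_0.si", "_1.cfs"],
}
paths = ["/data_path1/", "/data_path2/"]

striped = stripe_files(shard_files, paths)
per_shard = allocate_shards(shard_files, paths)

print(shards_hit_by_failure(striped, "/data_path1/"))    # both shards affected
print(shards_hit_by_failure(per_shard, "/data_path1/"))  # only one shard affected
```

In this model, losing /data_path1/ damages every shard under striping, while per-shard allocation confines the damage to the shards stored on that path.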

The data path is configured in the elasticsearch.yml file using the path.data parameter. As in version 1.x, you can still specify multiple data paths as comma-separated values, as follows:

path.data: /data_path1/,/data_path2/ 

With this configuration, all the files belonging to a single shard are stored on the same path. The other important change related to disk allocation has already been discussed in this chapter, in the Disk-based allocation section, where we mentioned how Elasticsearch now...