Mastering Elasticsearch 5.x - Third Edition

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. It leverages the capabilities of Apache Lucene and provides a new level of control over how you index and search even huge sets of data. This book gives you a brief recap of the basics and introduces you to the new features of Elasticsearch 5. We will guide you through the intermediate and advanced functionalities of Elasticsearch, such as querying, indexing, searching, and modifying data. We’ll also explore advanced concepts, including aggregation, index control, sharding, replication, and clustering. We’ll show you the monitoring and administration modules available in Elasticsearch, and will also cover backup and recovery. You will learn how to scale your Elasticsearch cluster to suit your context and improve its performance. We’ll also show you how to create your own analysis plugin for Elasticsearch. By the end of the book, you will have all the knowledge necessary to master Elasticsearch and put it to efficient use.

Data modeling techniques in Elasticsearch


Defining the structure of your data is key to getting search speed right and to keeping updates easy and inexpensive. Compared with the SQL world, most NoSQL solutions do not provide relational mappings and queries. Elasticsearch, despite being a NoSQL document store, provides some ways to manage relational data. However, each of them comes with trade-offs that we must be aware of before choosing how to define the schema of an index. There are primarily four ways to define document structure in Elasticsearch:

  • Flat structure (application side joins)

  • Data denormalization

  • Nested objects

  • Parent-child relationships
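To make the four options more concrete, here is a minimal sketch of what the documents and mappings might look like for each of them. The author/book domain, the field names, and the type names `author` and `book` are hypothetical and chosen only for illustration; the mapping fragments assume the Elasticsearch 5.x conventions of a `nested` field type and a `_parent` entry in the child type's mapping. Nothing is sent to a cluster here; the structures are plain Python dictionaries.

```python
import json

# Hypothetical example data: one author and two of their books.
author = {"author_id": 1, "name": "Jane Doe"}
books = [
    {"book_id": 10, "title": "First Book", "author_id": 1},
    {"book_id": 11, "title": "Second Book", "author_id": 1},
]

# 1. Flat structure: authors and books stay separate key-value documents.
#    The application resolves author_id itself (application-side join).
flat_author_doc = dict(author)
flat_book_docs = [dict(book) for book in books]

# 2. Data denormalization: author fields are copied into every book document,
#    so a single query returns everything, at the cost of duplicated data.
denormalized_docs = [
    {"book_id": book["book_id"], "title": book["title"], "author_name": author["name"]}
    for book in books
]

# 3. Nested objects: books live inside the author document; the mapping
#    marks the "books" field as nested.
nested_mapping = {
    "properties": {
        "name": {"type": "text"},
        "books": {"type": "nested", "properties": {"title": {"type": "text"}}},
    }
}
nested_doc = {
    "name": author["name"],
    "books": [{"title": book["title"]} for book in books],
}

# 4. Parent-child relationship: two types in one index; the child type's
#    mapping names its parent type through _parent (5.x style).
parent_child_mappings = {
    "author": {"properties": {"name": {"type": "text"}}},
    "book": {"_parent": {"type": "author"}, "properties": {"title": {"type": "text"}}},
}

print(json.dumps(nested_doc, indent=2))
```

The same two entities appear in all four shapes; what changes is where the relationship is stored and therefore who pays for it: the application, the index size, or the query.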

Flat structures

In flat structures, we index documents as simple key-value pairs or, sometimes, as plain objects; these are the simplest and fastest structures to work with. Storing data in this format allows for faster indexing as well as faster query execution. But it is hard to maintain the...
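The application-side join that goes with a flat structure can be sketched as follows. This is only an illustration under stated assumptions: the function name `join_books_with_authors`, the author/book documents, and their field names are hypothetical, and the two input lists stand in for the results of two separate search requests that a real application would issue.

```python
def join_books_with_authors(book_docs, author_docs):
    """Attach author names to flat book documents on the application side."""
    # Build a lookup table once so each book is resolved in constant time.
    authors_by_id = {doc["author_id"]: doc for doc in author_docs}
    joined = []
    for book in book_docs:
        author = authors_by_id.get(book["author_id"])
        joined.append({
            "title": book["title"],
            # A missing author is kept visible instead of raising an error.
            "author_name": author["name"] if author else None,
        })
    return joined


# Hypothetical flat documents, as two separate queries might return them.
author_docs = [{"author_id": 1, "name": "Jane Doe"}]
book_docs = [
    {"book_id": 10, "title": "First Book", "author_id": 1},
    {"book_id": 11, "title": "Orphan Book", "author_id": 99},
]

for row in join_books_with_authors(book_docs, author_docs):
    print(row)
```

Each document stays simple and fast to index and query, but the work of stitching related documents together moves into application code like this.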