Book Image

Elasticsearch Essentials

Book Image

Elasticsearch Essentials

Overview of this book

With constantly evolving and growing datasets, organizations have the need to find actionable insights for their business. ElasticSearch, which is the world's most advanced search and analytics engine, brings the ability to make massive amounts of data usable in a matter of milliseconds. It not only gives you the power to build blazing fast search solutions over a massive amount of data, but can also serve as a NoSQL data store. This guide will take you on a tour to become a competent developer quickly with a solid knowledge level and understanding of the ElasticSearch core concepts. Starting from the beginning, this book will cover these core concepts, setting up ElasticSearch and various plugins, working with analyzers, and creating mappings. This book provides complete coverage of working with ElasticSearch using Python and performing CRUD operations and aggregation-based analytics, handling document relationships in the NoSQL world, working with geospatial data, and taking data backups. Finally, we’ll show you how to set up and scale ElasticSearch clusters in production environments as well as providing some best practices.
Table of Contents (18 chapters)
Elasticsearch Essentials
Credits
About the Author
Acknowledgments
About the Reviewer
www.PacktPub.com
Preface
Index

Considerations for using document relationships


Over the years, Elasticsearch has improved a lot in reducing memory pressure by introducing doc_values, which is a little slower than the in-memory data structure, fielddata, but still offer reasonable speed and performance. However, because of the way nested and parent-child documents are stored and searched, you should keep the pros and cons in mind before modeling your data. The following is a comparison of nested versus parent-child types, which is nicely outlined by Zachary Tong in one of his articles:

Nested

Parent-Child

Stored in the same Lucene block as each other, which helps in a faster read/query performance. Reading a nested doc is faster than the equivalent parent/child.

Children are stored separately from the parent, but are routed to the same shard. So parent/children performance is slightly less on read/query than nested.

Updating a single field in a nested document (parent or nested children) forces ES to re-index the entire...