Learning Elastic Stack 6.0

By : Pranav Shukla, Sharath Kumar M N
Overview of this book

The Elastic Stack is a powerful combination of tools for distributed search, analytics, logging, and visualization of data from medium to massive data sets. The newly released Elastic Stack 6.0 brings new features and capabilities that empower users to find unique, actionable insights through these techniques. This book will give you a fundamental understanding of what the stack is all about, and how to use it efficiently to build powerful real-time data processing applications. After a quick overview of the newly introduced features in Elastic Stack 6.0, you’ll learn how to set up the stack by installing the tools, and see their basic configurations. Then it shows you how to use Elasticsearch for distributed searching and analytics, along with Logstash for logging, and Kibana for data visualization. It also demonstrates the creation of custom plugins using Kibana and Beats. You’ll find out about Elastic X-Pack, a useful extension for effective security and monitoring. We also provide useful tips on how to use the Elastic Cloud and deploy the Elastic Stack in production environments. On completing this book, you’ll have a solid foundational knowledge of the basic Elastic Stack functionalities. You’ll also have a good understanding of the role of each component in the stack to solve different data processing problems.

Modeling time series data


We often need to store time series data in Elasticsearch. The usual approach is to create a single index that holds all documents. One big index has its limitations, however, particularly in the following areas:

  • Scaling the index with an unpredictable volume over time
  • Changing the mapping over time
  • Automatically deleting older documents
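To make the last point concrete: with a single index, old documents can only be removed by matching them with a query and deleting them one by one, for example through the `_delete_by_query` API. The following is a minimal sketch of the request body such a cleanup might use. The field name `@timestamp`, the 30-day retention period, and the helper name `delete_older_than_body` are assumptions made for illustration; they do not come from the book.

```python
import json


def delete_older_than_body(timestamp_field: str, max_age_days: int) -> dict:
    """Build a delete-by-query body matching documents older than max_age_days.

    The body is meant for POST /<index-name>/_delete_by_query.
    The timestamp field name is a hypothetical example.
    """
    return {
        "query": {
            "range": {
                # Elasticsearch date math: "now-30d" means 30 days before now.
                timestamp_field: {"lt": f"now-{max_age_days}d"}
            }
        }
    }


# Hypothetical retention policy: drop documents older than 30 days.
body = delete_older_than_body("@timestamp", 30)
print(json.dumps(body, indent=2))
```

A job like this would have to run on a schedule, and every run has to locate and delete the matching documents individually inside the one big index.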

Let's look at how each problem manifests itself when we choose a single monolithic index.

Scaling the index with unpredictable volume over time

One of the most difficult decisions when creating an Elasticsearch cluster and its indices is how many primary shards and how many replica shards to create.
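As a rough illustration of where this decision is made, the sketch below builds the settings body that would be sent when creating an index (`PUT /<index-name>`). The `number_of_shards` and `number_of_replicas` settings are standard Elasticsearch index settings; the helper names and the values 5 and 1 are assumptions chosen only for the example.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build a create-index request body with explicit shard counts."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,   # primary shards
                "number_of_replicas": replicas,       # replica copies per primary
            }
        }
    }


def total_shards(primary_shards: int, replicas: int) -> int:
    """Each primary shard has `replicas` copies, so the total is P * (1 + R)."""
    return primary_shards * (1 + replicas)


# Hypothetical values for illustration only.
body = index_settings(5, 1)
print(json.dumps(body, indent=2))
print(total_shards(5, 1))
```

The number of replicas can be changed on a live index, whereas the number of primary shards is set when the index is created, which is part of what makes the choice hard when future volume is unknown.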

Let's see why the number of shards matters in the following subsections:

  • Unit of parallelism in Elasticsearch
  • The effect of the number of shards on the relevance score
  • The effect of the number of shards on the accuracy of aggregations

Unit of parallelism...