Machine Learning with the Elastic Stack

By: Rich Collier, Bahaaldine Azarmi

Overview of this book

Machine Learning with the Elastic Stack is a comprehensive overview of the embedded commercial features of anomaly detection and forecasting. The book starts with installing and setting up the Elastic Stack. You will perform time series analysis on various kinds of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you will deploy machine learning within the Elastic Stack for logging, security, and metrics. In the concluding chapters, you will see how machine learning jobs can be automatically distributed and managed across the Elasticsearch cluster and made resilient to failure. By the end of this book, you will understand the performance aspects of incorporating machine learning within the Elastic ecosystem, and you will be able to create anomaly detection jobs and view their results directly in Kibana.
Table of Contents (12 chapters)

Sizing ML deployments

People often ask how to size their cluster appropriately if they plan to use Elastic ML. Beyond the obvious "it depends" answer, it is useful to have an empirical approach to the process. The Elastic blog post "Sizing for Machine Learning with Elasticsearch" (https://www.elastic.co/blog/sizing-machine-learning-with-elasticsearch) makes a key recommendation: use dedicated nodes for ML so that ML jobs do not interfere with the other tasks of the cluster's data nodes (indexing, searching, and so on). To scope how many dedicated nodes are necessary, follow this approach:

  • If no representative jobs have been created yet, use the generic rules of thumb from the blog, which are based on the overall cluster size:
    • For clusters with fewer than 10 data nodes, 1 dedicated ML node is recommended (2 for high availability)
    • Definitely at...