Machine Learning with the Elastic Stack

By: Rich Collier, Bahaaldine Azarmi

Overview of this book

Machine Learning with the Elastic Stack is a comprehensive overview of the embedded commercial features of anomaly detection and forecasting. The book starts with installing and setting up the Elastic Stack. You will perform time series analysis on varied kinds of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you will deploy machine learning within the Elastic Stack for logging, security, and metrics. In the concluding chapters, you will see how machine learning jobs can be automatically distributed and managed across the Elasticsearch cluster and made resilient to failure. By the end of this book, you will understand the performance aspects of incorporating machine learning within the Elastic ecosystem, and you will be able to create anomaly detection jobs and view their results directly in Kibana.
Table of Contents (12 chapters)

ML job throughput considerations

ML is awesome, and is no doubt very fast and scalable, but there is still a practical upper bound on the events/second that any ML job can process, depending on a couple of different factors:

  • The speed at which data can be delivered to the ML algorithms (that is, query performance)
  • The speed at which the ML algorithms can chew through the data, given the desired analysis

For the latter, much of the performance is based upon the following:

  • The function(s) chosen for the analysis; for example, count is faster than lat_long
  • The chosen bucket_span (longer bucket spans are faster than shorter ones, because analyzing more buckets per unit of time compounds the per-bucket processing overhead, such as writing results)
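
To make the bucket_span point concrete, here is a minimal Python sketch, not taken from the book. It counts how many buckets a job must process to cover one day of data at a few bucket spans, then prints a bare-bones job configuration using the cheaper count function. The helper names bucket_count and job_config, the "@timestamp" time field, and the chosen spans are illustrative assumptions; the per-bucket cost itself is not modeled, only the number of buckets.

```python
import json


def bucket_count(span_seconds: int, window_seconds: int = 86_400) -> int:
    """Number of buckets needed to cover the window (rounded up)."""
    return -(-window_seconds // span_seconds)


def job_config(function: str, bucket_span: str) -> dict:
    """Build a minimal anomaly detection job body (hypothetical helper)."""
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [{"function": function}],
        },
        # Assumption: the data carries its event time in "@timestamp".
        "data_description": {"time_field": "@timestamp"},
    }


# Illustrative spans only; each bucket carries fixed overhead such as
# writing results, so fewer buckets per day means less total overhead.
spans = {"5m": 300, "15m": 900, "1h": 3600}
for label, seconds in spans.items():
    print(f"bucket_span={label}: {bucket_count(seconds)} buckets per day")

print(json.dumps(job_config("count", "15m"), indent=2))
```

Moving from a 5m to a 1h bucket span cuts the number of buckets per day from 288 to 24, which is where the reduction in per-bucket overhead comes from. The trade-off is coarser time resolution in the results, so the span should still match the duration of the anomalies you want to catch.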

However, if you have a defined analysis setup and can't really change it for other reasons, then there's not really...