Machine Learning with the Elastic Stack - Second Edition

By: Rich Collier, Camilla Montonen, Bahaaldine Azarmi

Overview of this book

Elastic Stack, previously known as the ELK stack, is a log analysis solution that helps users ingest, process, and analyze search data effectively. With the addition of machine learning, a key commercial feature, the Elastic Stack makes this process even more efficient. This updated second edition of Machine Learning with the Elastic Stack provides a comprehensive overview of the Elastic Stack's machine learning features for both time series data analysis and classification, regression, and outlier detection. The book starts by explaining machine learning concepts in an intuitive way. You'll then perform time series analysis on different types of data, such as log files, network flows, application metrics, and financial data. As you progress through the chapters, you'll deploy machine learning within the Elastic Stack for logging, security, and metrics. Finally, you'll discover how data frame analysis opens up a whole new set of use cases that machine learning can help you with. By the end of this Elastic Stack book, you'll have hands-on machine learning and Elastic Stack experience, along with the knowledge you need to incorporate machine learning into your distributed search and data analysis platform.
Table of Contents (19 chapters)
Section 1 – Getting Started with Machine Learning with Elastic Stack
Section 2 – Time Series Analysis – Anomaly Detection and Forecasting
Section 3 – Data Frame Analysis

Avoiding the over-engineering of a use case

I once worked with a customer to discuss different use cases for anomaly detection. This customer was building a hosted security operations center as part of their managed security service provider (MSSP) business, so they were keen to think about use cases in which ML could help.

A high-level theme of their use cases was to look at a user's behavior and find anything unexpected. One example we discussed was login activity from unusual or rare locations, such as "Bob just logged in from Ukraine, but he doesn't normally log in from there."

While thinking the implementation through, they noted that they had multiple clients, each of which had multiple users. They were therefore looking for ways to split or partition the data so that they could execute rare by country for each and every user of every client.
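To make the design under discussion concrete, here is a minimal sketch of what such a per-user job might look like. It only builds the request body that could be sent to the create anomaly detection jobs API (PUT _ml/anomaly_detectors/<job_id>). The function name build_rare_country_job, the field names source.geo.country_iso_code and user.name, the @timestamp time field, and the one-hour bucket span are all assumptions for illustration and are not taken from the customer's data. It shows the split being contemplated, not a recommendation.

```python
import json


def build_rare_country_job(country_field="source.geo.country_iso_code",
                           user_field="user.name",
                           bucket_span="1h"):
    """Build a request body for a 'rare by country, per user' job.

    All field names and the bucket span are hypothetical defaults;
    substitute whatever the login index actually contains.
    """
    detector = {
        # 'rare' flags values of the by-field that seldom occur over time.
        "function": "rare",
        "by_field_name": country_field,
        # Partitioning gives every user an independent baseline.
        "partition_field_name": user_field,
        "detector_description": f"rare by {country_field}, split by {user_field}",
    }
    return {
        "description": "Logins from countries that are rare for each user",
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [detector],
            "influencers": [user_field, country_field],
        },
        # Assumed name of the timestamp field in the login events.
        "data_description": {"time_field": "@timestamp"},
    }


if __name__ == "__main__":
    # Print the JSON that would form the body of the job creation request.
    print(json.dumps(build_rare_country_job(), indent=2))
```

Because the partition field creates a separate model for every user, the number of baselines grows with every client and every user added, which is the cost to weigh before settling on a split like this.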

I asked them to take a step back and said, "Is it worthy of an anomaly if anyone...