Hands-On Machine Learning with ML.NET

By: Jarred Capellman

Overview of this book

Machine learning (ML) is widely used in industries such as science, healthcare, and research, and its popularity is only growing. In May 2018, Microsoft introduced ML.NET to help .NET developers work with ML. With this book, you'll explore how to build ML.NET applications with the various ML models available, using C# code. The book starts by giving you an overview of ML and the types of ML algorithms used, along with covering what ML.NET is and why you need it to build ML apps. You'll then explore the ML.NET framework, its components, and APIs. The book will serve as a practical guide to building smart apps using the ML.NET library. You'll gradually become well versed in implementing ML algorithms such as regression, classification, and clustering with real-world examples and datasets. Each chapter covers a practical implementation, showing you how to apply ML within .NET applications. You'll also learn to integrate TensorFlow into ML.NET applications. Later, you'll discover how to store the results of a housing price prediction regression model in a database and display the real-time predictions from the database in a web application using ASP.NET Core Blazor and SignalR. By the end of this book, you'll be able to confidently perform basic to advanced-level machine learning tasks in ML.NET.
Table of Contents (19 chapters)

Section 1: Fundamentals of Machine Learning and ML.NET
Section 2: ML.NET Models
Section 3: Real-World Integrations with ML.NET
Section 4: Extending ML.NET

The model building process

Before diving into ML.NET, an understanding of core machine learning concepts is required. These concepts will create a foundation for you to build on as we start building models and learning the various algorithms ML.NET provides over the course of this book. At a high level, producing a model is a complex process; however, it can be broken down into six main steps:

1. Defining your problem statement
2. Defining your features
3. Obtaining a dataset
4. Feature extraction and pipeline
5. Model training
6. Model evaluation

Over the next few sections, we will go through each of these steps in detail to provide you with a clear understanding of how to perform each one and how it fits into the machine learning process as a whole.

Defining your problem statement

Effectively, what problem are you attempting to solve? Being specific at this point is crucial, as a vaguely defined problem can lead to considerable rework. For example, take the following problem statement: Predicting the outcome of an election. My first question upon hearing that problem statement would be: at what level? County, state, or national? Each level more than likely requires considerably more features and data to predict properly than the one below it. A better problem statement, especially early on in your machine learning journey, would target a specific position at the county level, such as Predicting the 2020 John Doe County Mayor race. With this more focused problem statement, your features and dataset are much more targeted and, more than likely, attainable. Even with more experience in machine learning, proper scoping of your problem statement is critical. The five Ws of Who, What, When, Where, and Why should be followed to keep your statement concise.

Defining your features

The second step in machine learning is defining your features. Think of features as components or attributes of the problem you wish to solve. In machine learning, specifically when creating a new model, features have one of the biggest impacts on your model's performance. Properly thinking through your problem statement will surface an initial set of features that drive differentiation in your dataset and, in turn, in your model's results. Going back to the Mayor example in the preceding section, what features would you gather as data points from each citizen? Perhaps start by looking at the Mayor's competition and where each candidate stands on issues relative to the others. These positions could be turned into features and then made into a poll for the citizens of John Doe County to answer. Using these data points would create a solid first pass at features. Another aspect of model building is running several iterations of feature engineering and model training, especially as your dataset grows. After model evaluation, feature importance is used to determine which features are actually driving your predictions. Occasionally, you will find that gut-instinct features turn out to be inconsequential after a few iterations of model training and feature engineering.
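
To make this concrete, here is a minimal sketch of what the poll's features might look like as an ML.NET input class. The class and column names are hypothetical, invented for the fictitious John Doe County poll; the [LoadColumn] attribute is how ML.NET maps a property to a column in the source data:

```csharp
using Microsoft.ML.Data;

// Hypothetical input schema for the fictitious John Doe County poll.
// Each property is a candidate feature (or the label), and each
// [LoadColumn] index maps it to a column in the extracted data file.
public class PollResponse
{
    [LoadColumn(0)]
    public float Age { get; set; }

    [LoadColumn(1)]
    public float YearsInCounty { get; set; }

    [LoadColumn(2)]
    public float AgreesWithCandidateOnTaxes { get; set; }

    [LoadColumn(3)]
    public float AgreesWithCandidateOnSchools { get; set; }

    // The label: did the respondent say they would vote for John Doe?
    [LoadColumn(4), ColumnName("Label")]
    public bool WillVoteForCandidate { get; set; }
}
```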

In Chapter 11, Training and Building Production Models, we will deep dive into best practices when defining features and common approaches to complex problems to obtain a solid first pass at feature engineering.

Obtaining a dataset

As you can imagine, one of the most important aspects of the model building process is obtaining a high-quality dataset. In the aforementioned case of supervised learning, the dataset must be labeled so the model can be trained on what the output should be; in the case of unsupervised learning, no labels are required. A common misconception when creating a dataset is that bigger is better. This is far from the truth in many cases. Continuing the preceding example, what if every poll response answered every single question the same way? At that point, your dataset is composed of identical data points, and your model will simply learn that one pattern, unable to properly predict support for any of the other candidates; fitting the training data this closely at the expense of generalizing is known as overfitting. A diverse but representative dataset is required for machine learning algorithms to properly build a production-ready model.
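
Assuming the hypothetical poll responses above were exported to a CSV file (the file name and schema are invented for this example), loading the dataset into ML.NET might look like the following sketch:

```csharp
using Microsoft.ML;

var mlContext = new MLContext(seed: 0);

// Load the hypothetical poll export into an IDataView; PollResponse is
// the feature class sketched in the previous section.
IDataView dataset = mlContext.Data.LoadFromTextFile<PollResponse>(
    "poll-responses.csv", separatorChar: ',', hasHeader: true);
```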

In Chapter 11, Training and Building Production Models, we will deep dive into the methodology of obtaining quality datasets, looking at helpful resources, ways to manage your datasets, and how to transform data, commonly referred to as data wrangling.

Feature extraction and pipeline

Once your features have been defined and your dataset has been obtained, the next step is to perform feature extraction. Depending on the size of your dataset and your features, feature extraction can be one of the most time-consuming elements of the model building process.

For example, let's say that the aforementioned fictitious John Doe County election poll received 40,000 responses, each captured from a web form and stored in a SQL database. With a SQL query, you could then export all of that data to a CSV file from which your model can be trained. At a high level, this is your feature extraction and pipeline. For more complex scenarios, such as predicting malicious web content or classifying images, extraction may involve pulling specific bytes out of binary files. Properly storing this extracted data so you can avoid re-running the extraction is crucial to iterating quickly (assuming the features did not change).
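
In ML.NET terms, the transformation side of this pipeline is an estimator chain. Here is a minimal sketch, continuing the hypothetical poll example, that concatenates the individual feature columns into the single Features vector most ML.NET trainers expect and caches the transformed rows between iterations:

```csharp
// Combine the individual poll columns into one "Features" vector column.
// The column names come from the hypothetical PollResponse class above.
var pipeline = mlContext.Transforms.Concatenate("Features",
        nameof(PollResponse.Age),
        nameof(PollResponse.YearsInCounty),
        nameof(PollResponse.AgreesWithCandidateOnTaxes),
        nameof(PollResponse.AgreesWithCandidateOnSchools))
    // Cache the transformed rows so repeated training runs do not redo
    // the transform work, which helps when iterating on features.
    .AppendCacheCheckpoint(mlContext);
```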

In Chapter 11, Training and Building Production Models, we will deep dive into ways to version your feature-extracted data and maintain control over your data, especially as your dataset grows in size.

Model training

After feature extraction, you are now prepared to train your model. Thankfully, model training with ML.NET is very straightforward. Depending on the amount of data extracted in the feature extraction phase, the complexity of the pipeline, and the specifications of the host machine, this step can take several hours to complete. When your pipeline becomes much larger and your model becomes more complex, you may find that you require more compute resources than your laptop or desktop can provide; tooling such as Spark exists to help you scale out to any number of nodes.
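
Continuing the hypothetical poll example, the training step itself boils down to a single Fit call. SdcaLogisticRegression is used here purely as an illustration; it is one of several binary classifiers that ship with ML.NET:

```csharp
// Append a trainer to the transform pipeline from the previous section.
var trainingPipeline = pipeline.Append(
    mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
        labelColumnName: "Label", featureColumnName: "Features"));

// Fit does the actual training and can take a while on large datasets.
ITransformer model = trainingPipeline.Fit(dataset);

// Persist the trained model so it can be reloaded without retraining.
mlContext.Model.Save(model, dataset.Schema, "poll-model.zip");
```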

In Chapter 11, Training and Building Production Models, we will discuss tooling and tips for scaling this step using an easy-to-use open source project.

Model evaluation

Once the model is trained, the last step is to evaluate it. The typical approach to model evaluation is to hold out a portion of your dataset for evaluation. The idea behind this is to take known data, submit it to your trained model, and measure your model's efficacy. The critical part of this step is that the held-out data must be representative of your dataset as a whole. If your holdout set is skewed one way or the other, you will more than likely get a false sense of either high or low performance. In the next chapter, we will deep dive into the various scoring and evaluation metrics. ML.NET provides a relatively easy interface for evaluating a model; however, each algorithm has unique properties to verify, which we will review as we deep dive into the various algorithms.
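
Here is a sketch of that holdout approach, again continuing the hypothetical poll example: 20% of the data is held back before training, and the trained model is then scored only against that unseen portion:

```csharp
using System;

// Hold out 20% of the dataset for evaluation before training.
var split = mlContext.Data.TrainTestSplit(dataset, testFraction: 0.2);

// Train only on the remaining 80%.
var model = trainingPipeline.Fit(split.TrainSet);

// Score the holdout set and compute binary classification metrics.
var predictions = model.Transform(split.TestSet);
var metrics = mlContext.BinaryClassification.Evaluate(
    predictions, labelColumnName: "Label");

Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");
Console.WriteLine($"AUC:      {metrics.AreaUnderRocCurve:F2}");
```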