Book Image

Machine Learning Fundamentals

By : Hyatt Saleh
Book Image

Machine Learning Fundamentals

By: Hyatt Saleh

Overview of this book

As machine learning algorithms become popular, new tools that optimize these algorithms are also developed. Machine Learning Fundamentals explains you how to use the syntax of scikit-learn. You'll study the difference between supervised and unsupervised models, as well as the importance of choosing the appropriate algorithm for each dataset. You'll apply unsupervised clustering algorithms over real-world datasets, to discover patterns and profiles, and explore the process to solve an unsupervised machine learning problem. The focus of the book then shifts to supervised learning algorithms. You'll learn to implement different supervised algorithms and develop neural network structures using the scikit-learn package. You'll also learn how to perform coherent result analysis to improve the performance of the algorithm by tuning hyperparameters. By the end of this book, you will have gain all the skills required to start programming machine learning algorithms.
Table of Contents (9 chapters)
Machine Learning Fundamentals
Preface

Evaluation Metrics


Model evaluation is indispensable for creating effective models that not only perform well over the data that was used to train the model but also generalize to unseen data. The task of evaluating the model is especially easy when dealing with supervised learning problems, where there is a ground truth that can be compared against the prediction of the model.

Determining the accuracy percentage of the model is crucial for its application to unseen data that does not have a label class to compare to. Considering this, for example, a model with an accuracy of 98% may allow the user to assume that the odds of having an accurate prediction are high, and hence the model should be trusted.

The evaluation of performance, as mentioned previously, should be done over the validation set (dev set) for fine-tuning the model, and over the test set for determining the expected performance of the selected model over unseen data.

Evaluation Metrics for Classification Tasks

A classification...