
Receiver operating characteristic curve


We have come across many budding data scientists who build a model and, in the name of evaluation, are content with the overall accuracy alone. However, that's not the correct way to go about evaluating a model. For example, let's say there's a dataset with a response variable that has two categories: customers willing to buy the product and customers not willing to buy it. Let's say that 95% of the customers are not willing to buy the product and 5% are willing to buy it, and that the classifier is able to correctly predict only the majority class. So, if there are 100 observations, TP = 0, TN = 95, and the remaining 5 positive cases are misclassified (FN = 5), yet this still results in 95% accuracy. However, it won't be right to conclude that this is a good model, as it's not able to classify the minority class at all.
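To make this concrete, here is a minimal sketch (not taken from the text) that computes the accuracy and recall for exactly these confusion-matrix counts:

# Counts from the example: 100 customers, 95 negatives, 5 positives,
# and a classifier that always predicts the majority (negative) class
TP, TN, FP, FN = 0, 95, 0, 5

accuracy = (TP + TN) / (TP + TN + FP + FN)   # 0.95
recall = TP / (TP + FN)                      # 0.0 -- not a single buyer is identified

print(accuracy, recall)                      # 0.95 0.0

The 95% accuracy hides the fact that the recall on the minority class is zero, which is exactly why we need the metrics discussed next.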

Hence, we need to look beyond accuracy so that we have a better judgement of the model. In this situation, recall, specificity, precision, and the receiver operating characteristic (ROC) curve come to the rescue. We learned about recall, specificity, and precision in the previous section. Now, let's understand what the ROC curve is.

Most classifiers produce a score between 0 and 1. The next step is to set up a threshold, and the classification is decided based on this threshold. Typically, 0.5 is the threshold: if the score is greater than 0.5, the observation is assigned to class 1, and if the score is less than 0.5, it falls into the other class.
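As a quick illustration (a sketch with made-up scores, not code from the text), applying a 0.5 threshold to a vector of predicted scores looks like this:

import numpy as np

# Hypothetical predicted probabilities from a classifier
scores = np.array([0.12, 0.55, 0.80, 0.43, 0.91])

# Scores above the threshold go to class 1, the rest to class 0
predictions = (scores > 0.5).astype(int)
print(predictions)   # [0 1 1 0 1]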

For the ROC curve, every point between 0.0 and 1.0 is treated as a threshold, so the threshold line keeps on moving from 0.0 to 1.0. Each threshold results in its own set of TP, TN, FP, and FN counts. At every threshold, the following metrics are calculated (see the sketch after this list):

  • True Positive Rate = TP/(TP+FN)

  • True Negative Rate = TN/(TN + FP)

  • False Positive Rate = 1 - True Negative Rate = FP/(FP + TN)
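The sweep over thresholds can be written out directly. The following is a minimal sketch (the labels and scores are made up for illustration) that computes the TPR and FPR at every candidate threshold:

import numpy as np

def tpr_fpr_at_thresholds(y_true, scores, thresholds):
    # Compute the True Positive Rate and False Positive Rate at each threshold
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    results = []
    for t in thresholds:
        y_pred = (scores >= t).astype(int)
        TP = np.sum((y_pred == 1) & (y_true == 1))
        FN = np.sum((y_pred == 0) & (y_true == 1))
        TN = np.sum((y_pred == 0) & (y_true == 0))
        FP = np.sum((y_pred == 1) & (y_true == 0))
        results.append((t, TP / (TP + FN), FP / (FP + TN)))   # (threshold, TPR, FPR)
    return results

# Example usage with made-up labels and scores
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.2, 0.7, 0.6, 0.1, 0.3, 0.8]
for t, tpr, fpr in tpr_fpr_at_thresholds(y_true, scores, np.linspace(0.0, 1.0, 11)):
    print(round(t, 1), tpr, fpr)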

The calculation of TPR and FPR starts with the threshold at 0. When the threshold line is at 0, every customer is predicted as willing to buy, so all of the positive cases are captured, but all of the customers who are not willing to buy are misclassified as buyers too, which means there are too many false positives (TPR = 1 and FPR = 1). As the threshold line moves toward the right from zero, the number of false positives starts to decline, while for a good classifier the true positive rate stays high for much longer, and each threshold contributes one point of the curve.

Finally, we plot a graph of the TPR versus the FPR after calculating them at every threshold:
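In practice, this can be done compactly with scikit-learn and matplotlib. The following is a sketch built on a toy dataset (none of the dataset, model, or variable names here come from the text):

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

# Toy imbalanced dataset and model, purely for illustration
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression().fit(X_train, y_train)

scores = clf.predict_proba(X_test)[:, 1]        # probability of the positive class
fpr, tpr, thresholds = roc_curve(y_test, scores)

plt.plot(fpr, tpr, label='ROC curve')
plt.plot([0, 1], [0, 1], color='red', linestyle='--', label='Random classification')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.show()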


The red diagonal line represents classification at random, that is, classification without the model. The perfect ROC curve runs straight up the y axis to its top, at the point (0, 1), and then along the top of the plot, so that, together with the diagonal, it encloses the whole triangle above the random line.

Area under ROC

To assess the model/classifier, we need to determine the area under the ROC curve (AUROC). The whole area of this plot is 1, as the maximum value of both the FPR and the TPR is 1, so the plot takes the shape of a unit square. The random line is positioned exactly at 45 degrees, which partitions the whole area into two symmetrical right triangles, so the areas under and above the red line are both 0.5. The perfect classifier is the one that attains an AUROC of 1, and, in general, the higher the AUROC, the better the model is.
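In code, the area can be computed directly. Here is a short sketch that continues with the y_test, scores, fpr, and tpr variables from the plotting sketch above (again, these names are illustrative, not from the text):

import numpy as np
from sklearn.metrics import roc_auc_score

auroc = roc_auc_score(y_test, scores)
print(auroc)                 # 1.0 = perfect classifier, 0.5 = no better than random

# The same number is approximated by integrating the plotted curve itself
print(np.trapz(tpr, fpr))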

In a situation where you have got multiple classifiers, you can use AUROC to determine which is the best one among the lot.
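For example (a sketch that reuses the toy X and y from the plotting sketch above, not code from the text), cross-validated AUROC can be used to compare candidate classifiers:

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Compare two candidate classifiers on the same data via cross-validated AUROC
for model in (LogisticRegression(max_iter=1000), RandomForestClassifier(random_state=42)):
    auc_scores = cross_val_score(model, X, y, scoring='roc_auc', cv=5)
    print(type(model).__name__, round(auc_scores.mean(), 3))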