Machine Learning with R - Fourth Edition

By: Brett Lantz

Overview of this book

Dive into R with this data science guide on machine learning (ML). Machine Learning with R, Fourth Edition, takes you through classification methods like nearest neighbor and Naive Bayes and regression modeling, from simple linear to logistic. Dive into practical deep learning with neural networks and support vector machines and unearth valuable insights from complex data sets with market basket analysis. Learn how to unlock hidden patterns within your data using k-means clustering. With three new chapters on data, you’ll hone your skills in advanced data preparation, mastering feature engineering, and tackling challenging data scenarios. This book helps you conquer high-dimensionality, sparsity, and imbalanced data with confidence. Navigate the complexities of big data with ease, harnessing the power of parallel computing and leveraging GPU resources for faster insights. Elevate your understanding of model performance evaluation, moving beyond accuracy metrics. With a new chapter on building better learners, you’ll pick up techniques that top teams use to improve model performance with ensemble methods and innovative model stacking and blending techniques. Machine Learning with R, Fourth Edition, equips you with the tools and knowledge to tackle even the most formidable data challenges. Unlock the full potential of machine learning and become a true master of the craft.

Preface

Who this book is for

What this book covers

What you need for this book

Introducing Machine Learning

The origins of machine learning

Uses and abuses of machine learning

How machines learn

Machine learning in practice

Machine learning with R

Managing and Understanding Data

R data structures

Managing data with R

Exploring and understanding data

Lazy Learning – Classification Using Nearest Neighbors

Understanding nearest neighbor classification

Example – diagnosing breast cancer with the k-NN algorithm

Probabilistic Learning – Classification Using Naive Bayes

Understanding Naive Bayes

Example – filtering mobile phone spam with the Naive Bayes algorithm

Divide and Conquer – Classification Using Decision Trees and Rules

Understanding decision trees

Example – identifying risky bank loans using C5.0 decision trees

Understanding classification rules

Example – identifying poisonous mushrooms with rule learners

Forecasting Numeric Data – Regression Methods

Understanding regression

Example – predicting auto insurance claims costs using linear regression

Understanding regression trees and model trees

Example – estimating the quality of wines with regression trees and model trees

Black-Box Methods – Neural Networks and Support Vector Machines

Understanding neural networks

Example – modeling the strength of concrete with ANNs

Understanding support vector machines

Example – performing OCR with SVMs

Finding Patterns – Market Basket Analysis Using Association Rules

Understanding association rules

Example – identifying frequently purchased groceries with association rules

Finding Groups of Data – Clustering with k-means

Understanding clustering

Finding teen market segments using k-means clustering

Evaluating Model Performance

Measuring performance for classification

Estimating future performance

Being Successful with Machine Learning

What makes a successful machine learning practitioner?

What makes a successful machine learning model?

Putting the “science” in data science

Advanced Data Preparation

Performing feature engineering

Feature engineering in practice

Exploring R’s tidyverse

Challenging Data – Too Much, Too Little, Too Complex

The challenge of high-dimension data

Making use of sparse data

Handling missing data

The problem of imbalanced data

Building Better Learners

Tuning stock models for better performance

Improving model performance with ensembles

Stacking models for meta-learning

Making Use of Big Data

Practical applications of deep learning

Unsupervised learning and big data

Adapting R to handle large datasets

Other Books You May Enjoy

Index

Measuring performance for classification

In the previous chapters, we measured classifier accuracy by dividing the number of correct predictions by the total number of predictions. This gives the proportion of cases in which the learner is correct, and the proportion of incorrect cases follows directly. For example, suppose that a classifier correctly predicted whether newborn babies were carriers of a treatable but potentially fatal genetic defect in 99,990 out of 100,000 cases. This would imply an accuracy of 99.99 percent and an error rate of only 0.01 percent.
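The arithmetic behind these two figures can be sketched in a few lines. This is a minimal illustration in Python (the book itself works in R); the variable names are hypothetical, and the counts are the ones quoted in the example above.

```python
# Counts taken from the newborn screening example in the text.
correct = 99_990   # predictions that matched the true status
total = 100_000    # all predictions made

# Accuracy is the share of correct predictions;
# the error rate is simply what is left over.
accuracy = correct / total
error_rate = 1 - accuracy

print(f"accuracy:   {accuracy:.2%}")
print(f"error rate: {error_rate:.2%}")
```

Because the error rate is defined as one minus the accuracy, the two numbers always carry the same information.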

At first glance, this appears to be an extremely valuable classifier. However, it would be wise to collect additional information before trusting a child’s life to the test. What if the genetic defect is found in only 10 out of every 100,000 babies? A test that invariably predicts no defect will be correct for 99.99 percent of all cases, but incorrect for 100 percent of the cases that matter most. In other words...
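The scenario described here can be reproduced with a small simulation. The sketch below is in Python rather than the book's R, and it assumes the prevalence given in the text (10 affected babies per 100,000); the label encoding and the names `actual`, `predicted`, and `detection_rate` are illustrative choices, not taken from the book.

```python
# Assumed prevalence from the text: 10 affected babies per 100,000.
n_total = 100_000
n_defect = 10

# True labels: 1 = defect present, 0 = no defect.
actual = [1] * n_defect + [0] * (n_total - n_defect)

# A degenerate "classifier" that always predicts no defect.
predicted = [0] * n_total

# Overall accuracy: share of predictions that match the truth.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / n_total

# Share of the true defect cases that the classifier catches.
caught = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
detection_rate = caught / n_defect

print(f"accuracy: {accuracy:.2%}")
print(f"defect cases detected: {detection_rate:.0%}")
```

The accuracy matches the figure quoted earlier, yet none of the ten affected babies is flagged, which is why a single accuracy number is not enough when one class is rare.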