Mastering Machine Learning with R

By: Cory Lesmeister

Business case


The business objective here is to see whether we can improve on the predictive performance achieved for some of the cases from previous chapters. For regression, we will revisit the prostate cancer dataset from Chapter 4, Advanced Feature Selection in Linear Models. The baseline mean squared error to beat is 0.444.

For classification, we will use both the breast cancer biopsy data from Chapter 3, Logistic Regression and Discriminant Analysis, and the Pima Indian Diabetes data from Chapter 5, More Classification Techniques - K-Nearest Neighbors and Support Vector Machines. On the breast cancer data, we achieved 97.6 percent predictive accuracy. On the diabetes data, the accuracy rate to improve on is 79.6 percent.

Both random forests and boosting will be applied to all three datasets. The simple tree method will be used only on the breast cancer data from Chapter 3 and the prostate cancer data from Chapter 4, Advanced Feature Selection in Linear Models...