Mastering Machine Learning with R - Second Edition

By: Cory Lesmeister, Doug Ortiz, Vikram Dhillon, Miroslav Kopecky

Overview of this book

This book will teach you advanced techniques in machine learning with the latest code in R 3.3.2. You will delve into statistical learning theory and supervised learning; design efficient algorithms; learn about creating recommendation engines; use multi-class classification and deep learning; and more. You will explore, in depth, topics such as data mining, classification, clustering, regression, predictive modeling, anomaly detection, and boosted trees with XGBoost. More than just knowing the outcome, you’ll understand how these concepts work and what they do. Topics such as neural networks and deep learning are introduced with a gentle learning curve. By the end of this book, you will be able to perform machine learning with R in the cloud using AWS in various scenarios with different datasets.

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

A Process for Success

The process

Business understanding

Data understanding

Data preparation

Modeling

Evaluation

Deployment

Algorithm flowchart

Summary

Linear Regression - The Blocking and Tackling of Machine Learning

Univariate linear regression

Multivariate linear regression

Other linear model considerations

Summary

Logistic Regression and Discriminant Analysis

Classification methods and linear regression

Logistic regression

Discriminant analysis overview

Multivariate Adaptive Regression Splines (MARS)

Model selection

Summary

Advanced Feature Selection in Linear Models

Regularization in a nutshell

Business case

Modeling and evaluation

Model selection

Regularization and classification

Summary

More Classification Techniques - K-Nearest Neighbors and Support Vector Machines

K-nearest neighbors

Support vector machines

Business case

Feature selection for SVMs

Summary

Classification and Regression Trees

An overview of the techniques

Business case

Summary

Neural Networks and Deep Learning

Introduction to neural networks

Deep learning, a not-so-deep overview

Business understanding

Data understanding and preparation

Modeling and evaluation

An example of deep learning

Summary

Cluster Analysis

Hierarchical clustering

K-means clustering

Gower and partitioning around medoids

Random forest

Business understanding

Data understanding and preparation

Modeling and evaluation

Summary

Principal Components Analysis

An overview of the principal components

Business understanding

Modeling and evaluation

Summary

Market Basket Analysis, Recommendation Engines, and Sequential Analysis

An overview of a market basket analysis

Business understanding

Data understanding and preparation

Modeling and evaluation

An overview of a recommendation engine

Business understanding and recommendations

Data understanding, preparation, and recommendations

Modeling, evaluation, and recommendations

Sequential data analysis

Summary

Creating Ensembles and Multiclass Classification

Ensembles

Business and data understanding

Modeling evaluation and selection

Multiclass classification

Business and data understanding

Model evaluation and selection

MLR's ensemble

Summary

Time Series and Causality

Univariate time series analysis

Business understanding

Modeling and evaluation

Summary

Text Mining

Text mining framework and methods

Topic models

Business understanding

Modeling and evaluation

Summary

R on the Cloud

Creating an Amazon Web Services account

Summary

R Fundamentals

Getting R up-and-running

Using R

Data frames and matrices

Creating summary statistics

Installing and loading R packages

Data manipulation with dplyr

Summary

Sources

A Process for Success

"If you don't know where you are going, any road will get you there."
- Robert Carrol

"If you can't describe what you are doing as a process, you don't know what you're doing."
- W. Edwards Deming

At first glance, this chapter may seem to have nothing to do with machine learning, but it has everything to do with it: specifically, with implementing it and making change happen. The smartest people, best software, and best algorithms do not guarantee success, no matter how well success is defined.

In most, if not all, projects, the key to successfully solving problems or improving decision-making is not the algorithm, but the softer, more qualitative skills of communication and influence. The problem many of us have with this is that it is hard to quantify how effective one is at these skills. It is probably safe to say that many of us ended up in this position because of a desire to avoid exactly these skills. After all, the highly successful TV comedy The Big Bang Theory was built on this premise. Therefore, the goal of this chapter is to set you up for success. The intent is to provide a process, a flexible one at that, through which you can become a change agent: a person who can influence others and turn insights into action without positional power.

We will focus on the Cross-Industry Standard Process for Data Mining (CRISP-DM). It is probably the most well-known and respected of all processes for analytical projects. Even if you use another industry process or something proprietary, there should still be a few gems in this chapter that you can take away.

I will not hesitate to say that all of this is easier said than done; without question, I'm guilty of every sin (of both commission and omission) that will be discussed in this chapter. With skill and some luck, you can avoid the many physical and emotional scars I've picked up over the last 12 years.

Finally, we will also look at a flowchart (a cheat sheet) that you can use to help identify which methodologies to apply to the problem at hand.
