Mastering Machine Learning with R - Third Edition

By: Cory Lesmeister

Overview of this book

Given the growing popularity of R, the zero-cost statistical programming environment, there has never been a better time to start applying ML to your data. This book will teach you advanced techniques in ML using the latest code in R 3.5. You will delve into various complex features of supervised learning, unsupervised learning, and reinforcement learning algorithms to design efficient and powerful ML models. This newly updated edition is packed with fresh examples covering a range of tasks from different domains. Mastering Machine Learning with R starts by showing you how to quickly manipulate data and prepare it for analysis. You will explore simple and complex models and understand how to compare them. You’ll also learn to use the latest library support, such as TensorFlow and Keras-R, for performing advanced computations. Additionally, you’ll explore complex topics, such as natural language processing (NLP), time series analysis, and clustering, which will further refine your skills in developing applications. Each chapter will help you implement advanced ML algorithms using real-world examples. You’ll even be introduced to reinforcement learning, along with its various use cases and models. In the concluding chapters, you’ll get a glimpse into how some of these black-box models can be diagnosed and understood. By the end of this book, you’ll be equipped with the skills to deploy ML techniques in your own projects or at work.
Table of Contents (16 chapters)

What this book covers

Here is a chapter-by-chapter list of changes compared with the second edition.

Chapter 1, Preparing and Understanding Data, covers the loading of data and demonstrates how to obtain an understanding of its structure and dimensions, as well as how to install the necessary packages.

Chapter 2, Linear Regression, contains improved code and superior charts; other than that, it remains relatively close to the original.

Chapter 3, Logistic Regression, contains improved and streamlined code. One of my favorite techniques, multivariate adaptive regression splines, has been added. This technique performs well, handles non-linearity, and is easy to explain. It is my base model.

Chapter 4, Advanced Feature Selection in Linear Models, includes techniques not only for regression, but also for a classification problem.

Chapter 5, K-Nearest Neighbors and Support Vector Machines, includes streamlined and simplified code.

Chapter 6, Tree-Based Classification, is augmented by the addition of the very popular techniques provided by the xgboost package. Additionally, the technique of using a random forest as a feature selection tool is incorporated.

Chapter 7, Neural Networks and Deep Learning, has been updated with additional information on deep learning methods and includes improved code for the H2O package, including hyperparameter search.

Chapter 8, Creating Ensembles and Multiclass Methods, has completely new content, involving the utilization of several great packages.

Chapter 9, Cluster Analysis, now includes the methodology for executing unsupervised learning with random forests.

Chapter 10, Principal Component Analysis, uses a different dataset, and an out-of-sample prediction has been added.

Chapter 11, Association Analysis, explains association analysis, which applies not only to making recommendations, product placement, and promotional pricing, but also to manufacturing, web usage, and healthcare.

Chapter 12, Time Series and Causality, includes a couple of additional years of climate data, along with a demonstration of different causality test methods.

Chapter 13, Text Mining, includes additional data and improved code.

Appendix, Creating a Package, includes additional data packages.