Practical Machine Learning with R

By: Brindha Priyadarshini Jeyaraman, Ludvig Renbo Olsen, Monicah Wambugu

Overview of this book

With huge amounts of data being generated every moment, businesses need applications that apply complex mathematical calculations to data repeatedly and at speed. With machine learning techniques and R, you can develop these kinds of applications efficiently. Practical Machine Learning with R begins by helping you grasp the basics of machine learning methods, while also highlighting how and why they work. The focus is on getting these algorithms to work in practice rather than on mathematical derivations. As you progress from one chapter to the next, you will gain hands-on experience of building a machine learning solution in R. Using R packages such as rpart, randomForest, and mice (multiple imputation by chained equations), you will learn to implement algorithms including neural network classifiers, decision trees, and linear and non-linear regression. You’ll also delve into machine learning techniques for both supervised and unsupervised learning, and gain insight into partitioning datasets and evaluating and comparing the results from each model. By the end of this book, you will be able to approach business problems by forming a good problem statement, selecting the most appropriate model, and ensuring that you do not overtrain it.

About the Book

Minimum Hardware Requirements

Software Requirements

Conventions

Installation and Setup

Installing R

Installing R Studio

Installing Libraries

Installing the Code Bundle

Additional Resources

Free Chapter

An Introduction to Machine Learning

Introduction

The Machine Learning Process

Introduction to R

Machine Learning Models

Regression

Summary

Data Cleaning and Pre-processing

Introduction

Advanced Operations on Data Frames

Identifying the Input and Output Variables

Identifying the Category of Prediction

Handling Missing Values, Duplicates, and Outliers

Handling Outliers

Summary

Feature Engineering

Introduction

Types of Features

Time Series Features

Handling Categorical Variables

Derived Features or Domain-Specific Features

Adding Features to a Data Frame

Handling Redundant Features

Feature Selection

Summary

Introduction to neuralnet and Evaluation Methods

Introduction

Classification

Model Selection

Multiclass Classification Overview

Summary

Linear and Logistic Regression Models

Regression and Classification with Decision Trees

Model Selection by Multiple Disagreeing Metrics

Summary

Unsupervised Learning

Introduction

Overview of Unsupervised Learning (Clustering)

DIANA

Applications of Clustering

k-means Clustering

Summary

Appendix

Chapter 1: An Introduction to Machine Learning

Chapter 2: Data Cleaning and Pre-processing

Chapter 3: Feature Engineering

Chapter 4: Introduction to neuralnet and Evaluation Methods

Chapter 5: Linear and Logistic Regression Models

Chapter 6: Unsupervised Learning

Introduction

Data cleaning and preparation takes about 70% of the effort in a machine learning project. This step is essential because the quality of the data determines the accuracy of the prediction model. A clean dataset should contain good samples of the scenarios that we want to predict, which in turn gives good prediction results. The data should also be balanced, meaning that every category we want to predict should have a similar number of samples. For example, if we want to predict whether it will rain on a particular day and the sample size is 100, the data could contain 40 samples labeled "It will rain" and 60 samples labeled "It will not rain", or vice versa. However, if the ratio is 20:80 or 30:70, the dataset is unbalanced and will not yield good results for the minority class.
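As a minimal sketch of the balance check described above, the following base R snippet computes class proportions for a label vector and flags a dataset whose smallest class falls below a cut-off. The label vectors, the helper names `class_proportions` and `is_unbalanced`, and the 0.35 threshold are all illustrative assumptions chosen to match the 40:60 and 20:80 examples in the text; they are not taken from the book.

```r
# Hypothetical labels mirroring the rain example: 100 days each
balanced   <- c(rep("rain", 40), rep("no_rain", 60))
unbalanced <- c(rep("rain", 20), rep("no_rain", 80))

# Share of samples in each class
class_proportions <- function(labels) {
  prop.table(table(labels))
}

# Flag a dataset as unbalanced when its smallest class share is below
# min_share. The 0.35 default is an assumed, illustrative cut-off that
# separates the 40:60 case from the 30:70 and 20:80 cases in the text.
is_unbalanced <- function(labels, min_share = 0.35) {
  min(class_proportions(labels)) < min_share
}

print(class_proportions(balanced))
print(is_unbalanced(balanced))    # FALSE
print(is_unbalanced(unbalanced))  # TRUE
```

In practice the acceptable ratio depends on the problem and the model, so the threshold should be treated as a tunable parameter rather than a fixed rule.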

In the following section, we will look at the essential operations performed on data frames in R. These operations will help us to manipulate and analyze...
