Book Image

Machine Learning with R Cookbook

By : Yu-Wei, Chiu (David Chiu)
Book Image

Machine Learning with R Cookbook

By: Yu-Wei, Chiu (David Chiu)

Overview of this book

<p>The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.</p> <p>This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. Data exploration examples are provided that demonstrate how powerful data visualization and machine learning is in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.</p>
Table of Contents (21 chapters)
Machine Learning with R Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Resources for R and Machine Learning
Dataset – Survival of Passengers on the Titanic
Index

Classifying data with random forest


Random forest is another useful ensemble learning method that grows multiple decision trees during the training process. Each decision tree will output its own prediction results corresponding to the input. The forest will use the voting mechanism to select the most voted class as the prediction result. In this recipe, we will illustrate how to classify data using the randomForest package.

Getting ready

In this recipe, we will continue to use the telecom churn dataset as the input data source to perform classifications with the random forest method.

How to do it...

Perform the following steps to classify data with random forest:

  1. First, you have to install and load the randomForest package;

    > install.packages("randomForest")
    > library(randomForest)
    
  2. You can then fit the random forest classifier with a training set:

    > churn.rf = randomForest(churn ~ ., data = trainset, importance = T)
    > churn.rf
    
    Call:
     randomForest(formula = churn ~ ., data = trainset...