Machine Learning with R Quick Start Guide

By: Iván Pastor Sanz

Overview of this book

Machine Learning with R Quick Start Guide takes you on a data-driven journey that starts with the very basics of R and machine learning. It gradually builds upon core concepts so you can handle the varied complexities of data and understand each stage of the machine learning pipeline. From data collection to implementing Natural Language Processing (NLP), this book covers it all. You will implement key machine learning algorithms to understand how they are used to build smart models. You will cover tasks such as clustering, logistic regressions, random forests, support vector machines, and more. Furthermore, you will also look at more advanced aspects such as training neural networks and topic modeling. By the end of the book, you will be able to apply the concepts of machine learning, deal with data-related problems, and solve them using the powerful yet simple language that is R.

Testing a random forest model

A random forest is an ensemble of decision trees. A decision tree recursively splits the training sample into two or more increasingly homogeneous subsets according to the values of the independent variables, and it handles both categorical and continuous variables. At each node, the attribute that best separates the observations is selected and used for the split. This continues until a stopping criterion is met, at which point the remaining nodes become leaf nodes. Each tree grown this way is treated as a weak learner. In a random forest, every tree is built on a random subset of the rows and columns of the training data. The higher the number of trees, the lower the variance of the ensemble. To make a final prediction, a regression random forest averages the predictions of all of the trees, while a classification random forest typically takes a majority vote across the trees (some implementations average the trees' predicted class probabilities instead).
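As a rough illustration of the two ideas above (random subsets of rows and columns, and combining the trees' outputs), here is a minimal base R sketch. It does not train any trees. The per-tree predictions, the class labels `"yes"`/`"no"`, and the names `class_votes`, `reg_preds`, and `majority_vote` are all hypothetical. It assumes the common convention that classification forests combine trees by majority vote and regression forests by averaging.

```r
# --- Random subsets for one tree (sizes are made up for illustration) ---
set.seed(1)
n_rows <- 10   # hypothetical number of training observations
n_cols <- 6    # hypothetical number of independent variables

# Bootstrap sample of rows: drawn with replacement, same size as the data
boot_rows <- sample(n_rows, size = n_rows, replace = TRUE)

# Random subset of candidate columns (sqrt rule, a common default, assumed here)
cand_cols <- sample(n_cols, size = floor(sqrt(n_cols)))

# --- Combining per-tree predictions (values are invented, not model output) ---
# Rows = observations, columns = trees
class_votes <- matrix(
  c("yes", "yes", "no",  "yes", "no",
    "no",  "no",  "no",  "yes", "no",
    "yes", "no",  "yes", "yes", "yes"),
  nrow = 3, byrow = TRUE
)

# Classification: pick the most frequent class across the trees
majority_vote <- function(votes) names(which.max(table(votes)))
class_pred <- apply(class_votes, 1, majority_vote)

# Regression: average the numeric predictions of the trees
reg_preds <- matrix(
  c( 1,  2,  3,  2,  2,
    10, 12, 11, 13, 14),
  nrow = 2, byrow = TRUE
)
reg_pred <- rowMeans(reg_preds)

print(class_pred)
print(reg_pred)
```

In a real workflow these steps are handled internally by a random forest package. The sketch only makes the aggregation logic visible.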

When a random forest is trained, several parameters can be set...