Machine Learning with R Cookbook

By: Yu-Wei, Chiu (David Chiu)

Overview of this book

The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.

This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. The data exploration examples demonstrate how powerful data visualization and machine learning are for discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.

Studying a case of linear regression on SLID data


Building on the previous section, this recipe explores more complex data with linear regression: we demonstrate how to apply it to the Survey of Labor and Income Dynamics (SLID) dataset.

Getting ready

Check that the car library is installed and loaded, as it is required to access the SLID dataset.
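
The check described above can be sketched as follows. This is a minimal example rather than the book's own listing: the CRAN mirror URL and the variable names car_loaded and slid_available are assumptions made for illustration, and the install step needs network access.

```r
# Install car only if it is missing (the mirror URL is an assumed choice).
if (!requireNamespace("car", quietly = TRUE)) {
  install.packages("car", repos = "https://cloud.r-project.org")
}

# Attach the package; recent releases keep SLID in carData,
# which is attached together with car.
library(car)

# Confirm that the package is on the search path and SLID can be found.
car_loaded     <- "package:car" %in% search()
slid_available <- exists("SLID")
```

If both flags are TRUE, the rest of the recipe can refer to SLID directly.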

How to do it...

Follow these steps to perform linear regression on SLID data:

  1. You can use the str function to get an overview of the data:

    > str(SLID)
    'data.frame':  7425 obs. of  5 variables:
     $ wages    : num  10.6 11 NA 17.8 NA ...
     $ education: num  15 13.2 16 14 8 16 12 14.5 15 10 ...
     $ age      : int  40 19 49 46 71 50 70 42 31 56 ...
     $ sex      : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 1 1 1 2 1 ...
     $ language : Factor w/ 3 levels "English","French",..: 1 1 3 3 1 1 1 1 1 1 ...
    
  2. Next, visualize the variable wages against language, age, education, and sex in a 2 x 2 grid of plots:

    > par(mfrow=c(2,2))
    >...
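
The listing above is cut off, so the following is only a sketch of one way steps 1 and 2 could be carried out, not the recipe's own code. It assumes car is installed; the names na_counts and old_par are introduced here for illustration.

```r
library(car)  # assumes car (and with it the SLID data) is installed

# Step 1 follow-up: str() shows NA values in wages, so count the
# missing values in every column.
na_counts <- colSums(is.na(SLID))

# Step 2: a 2 x 2 grid with wages on the vertical axis in each panel.
old_par <- par(mfrow = c(2, 2))
plot(wages ~ language,  data = SLID)  # factor predictor: box plots
plot(wages ~ age,       data = SLID)  # numeric predictor: scatter plot
plot(wages ~ education, data = SLID)  # numeric predictor: scatter plot
plot(wages ~ sex,       data = SLID)  # factor predictor: box plots
par(old_par)                          # restore the previous layout
```

Using the formula interface with data = SLID keeps the axis labels short, and saving the value returned by par() makes it easy to return to a single-panel layout afterwards.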