Book Image

Machine Learning with R Cookbook

By : Yu-Wei, Chiu (David Chiu)
Book Image

Machine Learning with R Cookbook

By: Yu-Wei, Chiu (David Chiu)

Overview of this book

<p>The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.</p> <p>This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. Data exploration examples are provided that demonstrate how powerful data visualization and machine learning is in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.</p>
Table of Contents (21 chapters)
Machine Learning with R Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Resources for R and Machine Learning
Dataset – Survival of Passengers on the Titanic
Index

Exploring and visualizing data


After imputing the missing values, one should perform an exploratory analysis, which involves using a visualization plot and an aggregation method to summarize the data characteristics. The result helps the user gain a better understanding of the data in use. The following recipe will introduce how to use basic plotting techniques with a view to help the user with exploratory analysis.

Getting ready

This recipe needs the previous recipe to be completed by imputing the missing value in the age and Embarked attribute.

How to do it...

Perform the following steps to explore and visualize data:

  1. First, you can use a bar plot and histogram to generate descriptive statistics for each attribute, starting with passenger survival:

    > barplot(table(train.data$Survived), main="Passenger Survival",  names= c("Perished", "Survived"))
    

    Passenger survival

  2. We can generate the bar plot of passenger class:

    > barplot(table(train.data$Pclass), main="Passenger Class",  names= c("first...