Mastering Machine Learning with R

By: Cory Lesmeister

Chapter 9. Principal Components Analysis

 

"Some people skate to the puck. I skate to where the puck is going to be."

 
 --Wayne Gretzky

This is the second chapter to focus on unsupervised learning techniques. In the prior chapter, we covered cluster analysis, which groups similar observations. In this chapter, we will see how to reduce dimensionality and improve our understanding of the data by grouping correlated variables with Principal Components Analysis (PCA). We will then use the principal components in supervised learning.

In many datasets, particularly in the social sciences, you will see many variables that are highly correlated with each other. Such datasets may additionally suffer from high dimensionality or, as it is known, the curse of dimensionality. This is a problem because the number of samples needed to estimate a function grows exponentially with the number of input features. In such datasets, it may be the case that some variables are...