Mastering Data Analysis with R

By: Gergely Daróczi

Chapter 10. Classification and Clustering

In the previous chapter, we concentrated on how to compress the information found in a number of continuous variables into a smaller set of numbers. These statistical methods are somewhat limited, however, when we are dealing with categorical data, for example when analyzing surveys.

Although some methods try to convert discrete variables into numeric ones, such as by using a number of dummy or indicator variables, in most cases it's simply better to think about our research design goals instead of forcing previously learned methods onto the analysis.

Note

We can replace a categorical variable with a number of dummy variables by creating a new variable for each label of the original discrete variable, and then assigning 1 to the column that matches the label of each observation and 0 to all the others. Such values can be used as numeric variables in statistical analysis, especially with regression models.
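
The dummy coding described in this note can be sketched in a few lines of base R. This is only a minimal illustration and not code from the book: the answers vector and its labels are made up, and the variable names are hypothetical.

```r
# Hypothetical survey answers (made-up data for illustration only)
answers <- factor(c("yes", "no", "maybe", "yes"))

# Build one indicator column per label: 1 where the observation
# carries that label, 0 everywhere else
dummies <- sapply(levels(answers), function(label) {
    as.integer(answers == label)
})

# Each row now holds a single 1, in the column of its original label
dummies
```

The resulting matrix has one row per observation and one column per label, so its columns can be passed on as ordinary numeric variables. Note that regression functions in R usually handle factors on their own, so building the columns by hand like this is mostly useful for seeing what the encoding looks like.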

When we analyze a sample and target population via categorical variables, usually...