Mastering Data Analysis with R

By: Gergely Daróczi

Adequacy tests


Before reducing the number of dimensions or searching for latent variables in a dataset with multivariate statistical methods, first check whether the variables are correlated and whether the data is normally distributed.
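The correlation part of this check can be sketched with base R alone. The example below is only an illustration: it assumes the built-in mtcars data and an arbitrary choice of four columns as a stand-in for your own numeric variables, neither of which comes from this chapter.

```r
# Assumption: mtcars (shipped with base R) stands in for your own numeric data;
# the four columns are picked only for illustration.
vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Pairwise Pearson correlations; values near -1 or 1 point to shared
# structure that dimension reduction could exploit.
cor_mat <- cor(vars)
print(round(cor_mat, 2))

# Largest absolute off-diagonal correlation as a one-number summary.
max_abs_r <- max(abs(cor_mat[upper.tri(cor_mat)]))
print(max_abs_r)
```

If all off-diagonal values were close to zero, there would be little common variance for a PCA or a factor model to summarize.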

Normality

The latter is often not a strict requirement. For example, the results of a PCA can still be valid and interpretable without multivariate normality; maximum likelihood factor analysis, on the other hand, does rely on this strong assumption.
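Whether that assumption is plausible can be screened column by column before choosing a method. The following is a minimal sketch, again assuming the built-in mtcars data as a stand-in, and using the Shapiro-Wilk test from base R, which is a swapped-in univariate check and not a test of multivariate normality:

```r
# Assumption: mtcars stands in for your own numeric variables.
vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Shapiro-Wilk p-value for each column; small values indicate a departure
# from univariate normality. Passing this screen for every column does not
# establish multivariate normality.
p_values <- sapply(vars, function(x) shapiro.test(x)$p.value)
print(round(p_values, 3))
```

Columns with very small p-values are a hint to prefer methods that do not depend on normality, or to consider transforming the data first.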

Tip

You should always use the appropriate methods to achieve your data analysis goals, based on the characteristics of your data.

Anyway, you can use (for example) qqplot to compare the distributions of two variables pair-wise, and qqnorm to check the univariate normality of each variable visually. First, let's demonstrate this with a subset of hflights:

> library(hflights)
> # keep only the flights to JFK, with their taxi-in and taxi-out times
> JFK <- hflights[which(hflights$Dest == 'JFK'),
+                 c('TaxiIn', 'TaxiOut')]

So we filter our dataset...