Data Analysis with STATA

Variance inflation factor and multicollinearity


What if your independent variables are related to each other, for example, date of birth and age? One can be derived from the other, so both carry essentially the same information. When such variables enter the same regression, the model cannot cleanly separate their individual effects: the coefficients of the variables become unstable and their standard errors are inflated. This condition is called multicollinearity. It can be detected using the variance inflation factor (VIF). The VIF for a given variable indicates how strongly it is correlated with the other independent variables in the model. The VIF cutoffs above which a variable is considered multicollinear are set by industry convention. Healthcare and marketing data generally use a cutoff of 3: each variable that has a VIF higher than 3 is considered to be multicollinear and is typically dropped from the model.
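
To make the cutoff more concrete, the standard textbook definition of the VIF can be sketched as follows. The notation is an assumption chosen for this illustration and does not come from the book: $R_j^2$ denotes the R-squared obtained by regressing the $j$-th independent variable on all the other independent variables.

```latex
% Standard definition of the VIF (notation chosen here, not the book's):
% R_j^2 is the R-squared from regressing predictor j on the other predictors.
\[
  \mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\]
% No correlation with the other predictors gives the minimum value:
\[
  R_j^2 = 0 \;\Longrightarrow\; \mathrm{VIF}_j = 1
\]
% The cutoff of 3 corresponds to two thirds of the variance of predictor j
% being explained by the other predictors:
\[
  \mathrm{VIF}_j = 3 \;\Longleftrightarrow\; R_j^2 = 1 - \tfrac{1}{3} = \tfrac{2}{3} \approx 0.67
\]
```

Under this definition, a VIF of 3 means the variance of that coefficient estimate is three times what it would be if the variable were uncorrelated with the other predictors, so its standard error is larger by a factor of about $\sqrt{3} \approx 1.73$.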

Here is the Stata code to detect multicollinearity...