Mastering Data Analysis with R

By: Gergely Daróczi


Logistic regression


So far, we have discussed linear regression models, which are appropriate for modeling continuous response variables. However, non-continuous, binary responses (such as being ill or healthy, or being faithful versus deciding to switch to a new job, mobile supplier, or partner) are also very common. The main difference from the continuous case is that we now model the probability of the outcome rather than the expected value of the response variable.

The naive solution would be to use the probability as the outcome in a linear model. The problem is that a probability must always lie between 0 and 1, and a linear model does not guarantee this bounded range. A better solution is to fit a logistic regression model, which models not the probability itself but the natural logarithm of the odds, called the logit. The logit can be any (positive or negative) number, so the problem of limited range is eliminated.
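
To make the range problem concrete, here is a minimal sketch in base R. It is not the book's example: the data are simulated, and every name in it (`x`, `y`, `fit_lm`, `fit_glm`, and so on) is an assumption chosen for illustration. It fits the naive linear model and a logistic regression to the same binary outcome, then compares the ranges of their predictions.

```r
# Simulated data: an assumption for illustration only, not taken from the book
set.seed(42)
x <- seq(-4, 4, length.out = 200)
# plogis() is the inverse logit: plogis(z) = 1 / (1 + exp(-z))
y <- rbinom(length(x), size = 1, prob = plogis(1.5 * x))

# Naive approach: a linear model with the 0/1 outcome as the response
fit_lm <- lm(y ~ x)
p_lm <- predict(fit_lm)

# Logistic regression: the model is linear in the logit, log(p / (1 - p))
fit_glm <- glm(y ~ x, family = binomial)
logit_glm <- predict(fit_glm, type = "link")      # fitted log-odds, unbounded
p_glm <- predict(fit_glm, type = "response")      # fitted probabilities

range(p_lm)       # fitted values of the linear model
range(logit_glm)  # fitted logits
range(p_glm)      # fitted probabilities of the logistic model
```

With data like these, the fitted values of the linear model would be expected to stray below 0 or above 1 at the extremes of `x`, while the logistic model's probabilities stay strictly inside that interval. The fitted logits are free to take any real value, and `plogis(logit_glm)` maps them back to the probabilities in `p_glm`.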

Let's have a simple example of predicting...