Mastering Data Analysis with R

By: Gergely Daróczi

Models for count data


Logistic regression can handle only binary responses. If you have count data, such as the number of deaths or failures in a given period of time or in a given geographical area, you can use Poisson or negative binomial regression. Count data is particularly common when working with aggregated datasets, which report the number of events classified into different categories.
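To make the idea of aggregated count data concrete, the following minimal sketch turns event-level records into counts per category. The `events` data frame and its `model` column are hypothetical and made up for illustration; they are not part of the dataset used in this chapter.

```r
# Hypothetical event-level records: one row per observed failure.
events <- data.frame(
  model = c("A", "A", "B", "A", "C", "B"),
  stringsAsFactors = FALSE
)

# Aggregate to one row per category, holding the number of events.
counts <- as.data.frame(table(events$model), stringsAsFactors = FALSE)
names(counts) <- c("model", "failures")
counts
```

A table in this shape, with one non-negative integer count per category, is the kind of response that count models are designed for.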

Poisson regression

Poisson regression models are generalized linear models with the logarithm as the link function, and they assume that the response follows a Poisson distribution. The Poisson distribution takes only non-negative integer values. It is appropriate for count data, such as the number of events occurring over a fixed period of time, especially when the events are rather rare, such as the number of hard drive failures per day.
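As a minimal sketch of what such a model looks like in R, the following code fits a Poisson regression with `glm` from the base `stats` package. The data is simulated: the predictor `age_months`, the response `failures`, and the coefficients used to generate them are assumptions chosen for illustration and are not the hard drive figures analyzed in this chapter.

```r
# Simulated data only; these are not real failure counts.
set.seed(42)
n <- 200
age_months <- runif(n, min = 1, max = 60)  # hypothetical predictor

# Assumed data-generating model: log(mu) = -1 + 0.03 * age_months
mu <- exp(-1 + 0.03 * age_months)
failures <- rpois(n, lambda = mu)          # non-negative integer counts

# Poisson regression: a GLM with a Poisson response and a log link.
fit <- glm(failures ~ age_months, family = poisson(link = "log"))

coef(fit)       # estimates on the log scale
exp(coef(fit))  # multiplicative effects on the expected count
```

Because of the log link, the coefficients are additive on the log scale, so exponentiating them gives the factor by which the expected count changes for a one-unit increase in the predictor.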

In the following example, we will use the Hard Drive Data Sets for 2013. The dataset was downloaded from https://docs.backblaze.com/public/hard-drive-data/2013_data.zip, but we polished...