Logistic regression allows us to fit a regression model to a categorical response. Here, we will look at the survival rates of passengers on the Titanic. The response is binary: each passenger is listed as either a survivor or a casualty of the disaster.
We will initially recode the Survival column to state 1 as survived and 0 as casualty. This isn't essential but can be a useful aid to the interpretation of the results. Then, we study the effects of age, class, and gender on the chances of survival.
The final steps store the event probability calculated from the fitted model so that we can plot the results in a scatterplot.
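To make the recoding and the stored event probability more concrete, here is a minimal sketch in Python rather than Minitab. It assumes the recode maps the 0/1 codes to text labels and that the event probability follows the standard logistic form. The function names, the indicator coding of gender and class, the placeholder coefficients, and the two example passengers are all hypothetical; none of the numbers are estimates from the Titanic data.

```python
import math

# Assumed labels: the text only says 1 = survived and 0 = casualty.
SURVIVAL_LABELS = {1: "Survived", 0: "Casualty"}


def recode_survival(codes):
    """Map 0/1 survival codes to text labels."""
    return [SURVIVAL_LABELS[code] for code in codes]


def event_probability(intercept, coefficients, predictors):
    """Logistic model: P(event) = 1 / (1 + exp(-linear_predictor))."""
    linear_predictor = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Placeholder coefficients for illustration only, NOT fitted values.
placeholder_coefficients = {"age": -0.05, "female": 1.0, "first_class": 1.0}

# Two made-up passengers (indicator coding: 1 = yes, 0 = no).
passengers = [
    {"age": 30, "female": 1, "first_class": 1},
    {"age": 30, "female": 0, "first_class": 0},
]

# Store one event probability per row, much as a stored column would hold them.
stored_probabilities = [
    event_probability(0.0, placeholder_coefficients, passenger)
    for passenger in passengers
]

# Recode two made-up survival codes into labels.
labels = recode_survival([1, 0])
```

In the recipe itself Minitab estimates the coefficients and stores the probabilities; the sketch only shows how a stored probability relates to the fitted linear predictor.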
The data is contained in text format at the StatSci website. The direct link to the Titanic data is as follows:
http://www.statsci.org/data/general/titanic.txt
The data will copy and paste directly into Minitab. The data columns are listed as Passenger class, Age, Gender, and Survival.
Do make sure that you check the dataset, as a couple of passenger details...