We'll use the Titanic dataset, which we worked with in Chapter 3, Finding a Needle in a Haystack, to build the logistic regression model. Since we have already explored this data, we won't perform any exploratory data analysis here.
This is a recap of the field descriptions of the Titanic dataset:
Survival: This refers to the survival of the passengers (0 = No and 1 = Yes)
Pclass: This refers to the passenger class (1 = 1st, 2 = 2nd, and 3 = 3rd)
Name: This refers to the names of the passengers
Sex: This refers to the gender of the passengers
Age: This refers to the age of the passengers
Sibsp: This refers to the number of siblings/spouses aboard
Parch: This refers to the number of parents/children aboard
Ticket: This refers to the ticket number
Fare: This refers to the passenger fares
Cabin: This refers to the cabin number
Embarked: This refers to the port of embarkation (C = Cherbourg, Q = Queenstown, and S = Southampton)
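The coded fields in the preceding list (Survival, Pclass, and Embarked) can be captured as simple lookup tables. The following is a minimal sketch in plain Python, not the approach used later in the chapter. The `decode_record` helper and the dictionary keys are assumptions made for illustration: the keys follow the field names as listed above, and the column names in your copy of the data may differ.

```python
# Code-to-label mappings taken from the field descriptions above.
SURVIVAL_LABELS = {0: "No", 1: "Yes"}
PCLASS_LABELS = {1: "1st", 2: "2nd", 3: "3rd"}
EMBARKED_LABELS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def decode_record(record):
    """Return a copy of a passenger record with coded fields spelled out.

    Fields that are absent, or whose code is not in the mapping (for
    example, a missing port of embarkation), are left unchanged.
    """
    # Hypothetical field names: they mirror the list above and may not
    # match the column names in the actual data file.
    mappings = {
        "Survival": SURVIVAL_LABELS,
        "Pclass": PCLASS_LABELS,
        "Embarked": EMBARKED_LABELS,
    }
    decoded = dict(record)  # copy, so the caller's record is not modified
    for field, labels in mappings.items():
        if field in decoded:
            decoded[field] = labels.get(decoded[field], decoded[field])
    return decoded


# Hypothetical record, made up for illustration (not a row from the dataset).
example = {"Survival": 1, "Pclass": 3, "Sex": "female", "Embarked": "Q"}
print(decode_record(example))
```

Keeping the mappings in one place makes it easy to turn the numeric codes back into readable labels when inspecting individual passengers.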