In this recipe, we will introduce logistic regression, a basic classifier. We will also show how to perform a grid search with cross-validation.
We will apply these techniques to a Kaggle dataset where the goal is to predict which passengers survived the sinking of the Titanic, based on real passenger data.
Note
Kaggle (www.kaggle.com/competitions) hosts machine learning competitions where anyone can download a dataset, train a model, and test the predictions on the website. The author of the best model might even win a prize! It is a fun way to get started with machine learning.
Download the Titanic dataset from the book's GitHub repository at https://github.com/ipython-books/cookbook-data.
The dataset has been obtained from www.kaggle.com/c/titanic-gettingStarted.
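Before working with the Titanic file, the two techniques named at the start of this recipe can be previewed on synthetic data. The following is a minimal sketch, not the recipe's own code. It assumes scikit-learn is installed, uses `make_classification` as a stand-in for the real dataset, and searches an arbitrary, purely illustrative grid of values for `C`, the inverse regularization strength of `LogisticRegression`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the Titanic dataset): 500 samples, 5 features.
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

# Hold out 25% of the samples to evaluate the selected model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Candidate values for C (an illustrative assumption, not taken from the recipe).
param_grid = {"C": [0.01, 0.1, 1.0, 10.0, 100.0]}

# For each candidate C, fit the classifier with 5-fold cross-validation
# on the training set, then refit the best candidate on the whole training set.
grid = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
grid.fit(X_train, y_train)

best_C = grid.best_params_["C"]
test_accuracy = grid.score(X_test, y_test)  # accuracy of the refit best model
print("Best C:", best_C)
print("Test accuracy:", test_accuracy)
```

The grid search only ever sees the training split, so the accuracy on the held-out test split gives a less optimistic estimate of how the selected model generalizes.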