Creating a baseline machine learning pipeline
In previous chapters, we gave you a single machine learning model to use throughout the chapter. In this chapter, we will first find the best machine learning model for our needs and then enhance that model with feature selection. We will begin by importing four different machine learning models:
- Logistic Regression
- K-Nearest Neighbors
- Decision Tree
- Random Forest
The code for importing the learning models is given as follows:
# Import four machine learning models
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
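All four classes follow the same scikit-learn estimator interface, so they can be collected and scored in a single loop. The sketch below is illustrative only and is not the chapter's own code: it instantiates each model with mostly default hyperparameters and scores it with 5-fold cross-validation. The synthetic dataset from make_classification, the models dictionary, and the baseline_scores name are all assumptions standing in for the chapter's data and helpers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption: not the chapter's real dataset)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical container for the four candidate models
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model
baseline_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    baseline_scores[name] = scores.mean()
    print(f"{name}: {baseline_scores[name]:.3f}")
```

Fixing random_state on the tree-based models keeps the comparison repeatable between runs. The scores themselves depend entirely on the data, so numbers from this synthetic set say nothing about the chapter's dataset.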
Once we have finished importing these modules, we will run each model through our get_best_model_and_accuracy function to get a baseline on how it handles the raw data. We will first have to establish some variables to do so. We will use the following code...