Book Image

Hyperparameter Tuning with Python

By : Louis Owen
Book Image

Hyperparameter Tuning with Python

By: Louis Owen

Overview of this book

Hyperparameters are an important element in building useful machine learning models. This book curates numerous hyperparameter tuning methods for Python, one of the most popular coding languages for machine learning. Alongside in-depth explanations of how each method works, you will use a decision map that can help you identify the best tuning method for your requirements. You’ll start with an introduction to hyperparameter tuning and understand why it's important. Next, you'll learn the best methods for hyperparameter tuning for a variety of use cases and specific algorithm types. This book will not only cover the usual grid or random search but also other powerful underdog methods. Individual chapters are also dedicated to the three main groups of hyperparameter tuning methods: exhaustive search, heuristic search, Bayesian optimization, and multi-fidelity optimization. Later, you will learn about top frameworks like Scikit, Hyperopt, Optuna, NNI, and DEAP to implement hyperparameter tuning. Finally, you will cover hyperparameters of popular algorithms and best practices that will help you efficiently tune your hyperparameter. By the end of this book, you will have the skills you need to take full control over your machine learning models and get the best models for the best results.
Table of Contents (19 chapters)
Section 1:The Methods
Section 2:The Implementation
Section 3:Putting Things into Practice

Chapter 1: Evaluating Machine Learning Models

Machine Learning (ML) models need to be thoroughly evaluated to ensure they will work in production. We have to ensure the model is not memorizing the training data and also ensure it learns enough from the given training data. Choosing the appropriate evaluation method is also critical when we want to perform hyperparameter tuning at a later stage.

In this chapter, we'll learn about all the important things we need to know when it comes to evaluating ML models. First, we need to understand the concept of overfitting. Then, we will look at the idea of splitting data into train, validation, and test sets. Additionally, we'll learn about the difference between random and stratified splits and when to use each of them.

We'll discuss the concept of cross-validation and its numerous variations of strategy: k-fold repeated k-fold, Leave One Out (LOO), Leave P Out (LPO), and a specific strategy when dealing with time-series data, called time-series cross-validation. We'll also learn how to implement each of the evaluation strategies using the Scikit-Learn package.

By the end of this chapter, you will have a good understanding of why choosing a proper evaluation strategy is critical in the ML model development life cycle. Also, you will be aware of numerous evaluation strategies and will be able to choose the most appropriate one for your situation. Furthermore, you will also be able to implement each of the evaluation strategies using the Scikit-Learn package.

In this chapter, we're going to cover the following main topics:

  • Understanding the concept of overfitting
  • Creating training, validation, and test sets
  • Exploring random and stratified split
  • Discovering k-fold cross-validation
  • Discovering repeated k-fold cross-validation
  • Discovering LOO cross-validation
  • Discovering LPO cross-validation
  • Discovering time-series cross-validation