Discovering repeated k-fold cross-validation
Repeated k-fold cross-validation simply performs k-fold cross-validation repeatedly, N times, with a different randomization in each repetition. The final evaluation score is the average of the scores from all folds across all repetitions. This strategy increases our confidence in the model's estimated performance.
So, why repeat the k-fold cross-validation instead of just increasing the value of k? It is true that increasing k reduces the bias of our model's estimated performance. However, it also increases the variance of that estimate, especially when we have a small number of samples, because each validation fold becomes smaller. Therefore, repeating the k-fold procedure is usually a better way to gain higher confidence in our model's estimated performance. Of course, this comes with a drawback: the increase in computation time.
To implement this strategy, we could write a manual for loop and apply the k-fold cross-validation strategy in each iteration. Fortunately, the Scikit-Learn package provides a dedicated class that implements this strategy for us:
from sklearn.model_selection import train_test_split, RepeatedKFold

df_cv, df_test = train_test_split(df, test_size=0.2, random_state=0)

rkf = RepeatedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rkf.split(df_cv):
    df_train, df_val = df_cv.iloc[train_index], df_cv.iloc[val_index]
    # perform training or hyperparameter tuning here
Choosing n_splits=4 and n_repeats=3 means that we will have 12 different train and validation sets. The final evaluation score is then just the average of all 12 scores. As you might expect, there is also a dedicated class that implements the repeated k-fold in a stratified fashion:
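To make the averaging concrete, here is a minimal sketch using Scikit-Learn's cross_val_score, which accepts a RepeatedKFold splitter directly via its cv parameter. The synthetic dataset and the logistic regression model are illustrative choices, not from the text:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Illustrative data and model; substitute your own
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

rkf = RepeatedKFold(n_splits=4, n_repeats=3, random_state=0)

# cross_val_score runs all 4 x 3 = 12 train/validation splits
scores = cross_val_score(model, X, y, cv=rkf)

print(len(scores))  # 12
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the standard deviation alongside the mean is a common way to convey how stable the estimate is across the 12 splits.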
from sklearn.model_selection import train_test_split, RepeatedStratifiedKFold

df_cv, df_test = train_test_split(df, test_size=0.2, random_state=0, stratify=df['class'])

rskf = RepeatedStratifiedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rskf.split(df_cv, df_cv['class']):
    df_train, df_val = df_cv.iloc[train_index], df_cv.iloc[val_index]
    # perform training or hyperparameter tuning here
The RepeatedStratifiedKFold class performs stratified k-fold cross-validation repeatedly, n_repeats times.
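The following sketch checks that stratification actually holds across every repetition. The imbalanced labels here are synthetic and purely illustrative:

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold

# Illustrative imbalanced labels: 80 samples of class 0, 20 of class 1
y = np.array([0] * 80 + [1] * 20)
X = np.zeros((100, 1))  # feature values are irrelevant to the splitting itself

rskf = RepeatedStratifiedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rskf.split(X, y):
    # every validation fold keeps the original 80/20 class ratio
    ratio = np.bincount(y[val_index]) / len(val_index)
    print(ratio)  # [0.8 0.2] in each of the 12 folds
```

This is why the stratified variant is preferred for classification problems with imbalanced classes: no fold ends up with a distorted class distribution by chance.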
Now that you have learned another variation of the cross-validation strategy, called repeated k-fold cross-validation, let's learn about the other variations next.