# Discovering repeated k-fold cross-validation

Repeated k-fold cross-validation simply performs k-fold cross-validation repeatedly, N times, with a different randomization in each repetition. The final evaluation score is the average of all scores from all folds across all repetitions. This strategy increases our confidence in the model's estimated performance.

So, why repeat the k-fold cross-validation? Why not just increase the value of k? Increasing the value of k does reduce the bias of our model's estimated performance. However, it also increases the variance of that estimate, especially when we have a small number of samples, because each validation fold becomes smaller. Therefore, repeating the k-fold procedure is usually the better way to gain higher confidence in our model's estimated performance, as the sketch below illustrates. Of course, this comes with a drawback, which is the increase in computation time.
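
To make this trade-off concrete, here is a minimal sketch that compares the spread of fold scores from a single 12-fold split against 4-fold cross-validation repeated 3 times, so both schemes use 12 fits. It peeks ahead at the `RepeatedKFold` class covered below; the synthetic dataset and `LogisticRegression` estimator are illustrative assumptions, not the book's running example:

```
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, RepeatedKFold, cross_val_score

# Illustrative data and estimator (assumptions for this sketch only)
X, y = make_classification(n_samples=120, random_state=0)
model = LogisticRegression(max_iter=1000)

# 12 fits either way: one pass of 12 folds vs. 4 folds repeated 3 times
cv_schemes = {
    "12-fold": KFold(n_splits=12, shuffle=True, random_state=0),
    "repeated 4-fold": RepeatedKFold(n_splits=4, n_repeats=3, random_state=0),
}
for name, cv in cv_schemes.items():
    scores = cross_val_score(model, X, y, cv=cv)
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

With only 120 samples, each 12-fold validation set contains just 10 samples, so its per-fold scores typically fluctuate more than the 30-sample validation folds of the repeated scheme.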

To implement this strategy, we could simply write a manual for-loop and apply the k-fold cross-validation strategy in each iteration. Fortunately, the Scikit-Learn package provides us with a dedicated class, `RepeatedKFold`, that implements this strategy:

```
from sklearn.model_selection import train_test_split, RepeatedKFold

# Hold out a test set first; cross-validation runs on the remaining 80%
df_cv, df_test = train_test_split(df, test_size=0.2, random_state=0)

rkf = RepeatedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rkf.split(df_cv):
    df_train, df_val = df_cv.iloc[train_index], df_cv.iloc[val_index]
    # perform training or hyperparameter tuning here
```
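
If we only need the aggregated score rather than each train/validation pair, the same kind of splitter can be passed directly to `cross_val_score`. A minimal sketch, reusing the illustrative `model`, `X`, and `y` from the earlier sketch (again assumptions, not the book's `df` example):

```
from sklearn.model_selection import RepeatedKFold, cross_val_score

rkf = RepeatedKFold(n_splits=4, n_repeats=3, random_state=0)
scores = cross_val_score(model, X, y, cv=rkf)  # one score per fold per repetition
print(len(scores), scores.mean())  # 12 scores; their mean is the final score
```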

Choosing `n_splits=4` and `n_repeats=3` means that we will get 12 different train and validation sets. The final evaluation score is then simply the average of all 12 scores. As you might expect, there is also a dedicated class that implements repeated k-fold in a stratified fashion:

```
from sklearn.model_selection import train_test_split, RepeatedStratifiedKFold

# Stratify the initial split as well, so the test set keeps the class balance
df_cv, df_test = train_test_split(df, test_size=0.2, random_state=0, stratify=df['class'])

rskf = RepeatedStratifiedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rskf.split(df_cv, df_cv['class']):
    df_train, df_val = df_cv.iloc[train_index], df_cv.iloc[val_index]
    # perform training or hyperparameter tuning here
```

The `RepeatedStratifiedKFold` class performs stratified k-fold cross-validation repeatedly, `n_repeats` times, preserving the class proportions in every fold of every repetition.
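
To see what stratification buys us here, a short sketch that prints the class distribution of each validation fold, assuming the `df_cv` data frame with its `'class'` column from the snippet above; with `RepeatedStratifiedKFold`, these proportions should stay close to those of the full data in every repetition:

```
from sklearn.model_selection import RepeatedStratifiedKFold

rskf = RepeatedStratifiedKFold(n_splits=4, n_repeats=3, random_state=0)
for i, (train_index, val_index) in enumerate(rskf.split(df_cv, df_cv['class'])):
    val_classes = df_cv['class'].iloc[val_index]
    print(f"fold {i}: {val_classes.value_counts(normalize=True).round(3).to_dict()}")
```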

Now that you have learned about another variation of the cross-validation strategy, repeated k-fold cross-validation, let's look at the remaining variations next.