Understanding successive halving
Successive Halving (SH) is an MFO method that not only focuses the search on a more promising hyperparameter subspace but also allocates the computational budget wisely across trials. Unlike CFS, which uses all of the data in every trial, SH spends less data on less promising subspaces and more data on more promising ones. In other words, SH is a variant of CFS with a much more clearly defined algorithm and a wiser use of the computational budget. SH is most effective as a hyperparameter tuning method when you are working with a large model (for example, a deep neural network) and/or a large amount of data.
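As a minimal sketch of this idea, the following example uses scikit-learn's `HalvingRandomSearchCV`, which implements SH with the training-set size as the budget: weak candidates are dropped early on small data samples, while surviving candidates are re-evaluated on more data. The toy dataset, estimator, and parameter grid here are illustrative choices, not taken from the text.

```python
# Successive Halving via scikit-learn (experimental import enables the class)
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingRandomSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Illustrative dataset and hyperparameter space (not from the text)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
param_distributions = {
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5, 10],
}

search = HalvingRandomSearchCV(
    RandomForestClassifier(n_estimators=10, random_state=0),
    param_distributions,
    resource="n_samples",  # the budget is the number of training samples
    factor=2,              # keep the best 1/2 of candidates each round
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Note that `factor` controls how aggressively candidates are eliminated: with `factor=2`, each round keeps the top half of candidates and doubles the per-candidate budget.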
Similar to CFS, SH also uses grid search or random search to search for the best set of hyperparameters. In the first iteration, SH performs a grid or random search over the whole hyperparameter space with a small budget, or amount of resources, and then gradually increases...