Deep Learning with TensorFlow 2 and Keras - Second Edition

By: Antonio Gulli, Amita Kapoor, Sujit Pal

Overview of this book

Deep Learning with TensorFlow 2 and Keras, Second Edition teaches neural networks and deep learning techniques alongside TensorFlow (TF) and Keras. You’ll learn how to write deep learning applications in the most powerful, popular, and scalable machine learning stack available. TensorFlow is the machine learning library of choice for professional applications, while Keras offers a simple and powerful Python API for accessing TensorFlow. TensorFlow 2 provides full Keras integration, making advanced machine learning easier and more convenient than ever before. The book introduces neural networks with TensorFlow, runs through the main applications (regression, ConvNets (CNNs), GANs, RNNs, NLP), covers two working example apps, and then dives into TF in production, TF mobile, and using TensorFlow with AutoML.

Hyperparameter tuning and AutoML

The experiments defined above offer several opportunities for fine-tuning a net. However, what works for this example will not necessarily work for other examples. For a given net, there are indeed multiple parameters that can be optimized (such as the number of hidden neurons, the BATCH_SIZE, the number of epochs, and many more, depending on the complexity of the net itself). These parameters are called "hyperparameters" to distinguish them from the parameters of the network itself, that is, the values of the weights and biases.
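
To make the distinction concrete, here is a minimal sketch of a simple dense classifier in Keras; the network shape and the specific values are illustrative assumptions, not the exact setup of the experiments above:

    import tensorflow as tf

    # Hyperparameters: values we choose before training (illustrative).
    N_HIDDEN = 128     # number of hidden neurons
    BATCH_SIZE = 128   # mini-batch size
    EPOCHS = 20        # passes over the training set

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(N_HIDDEN, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # Parameters: the weights and biases in model.trainable_variables;
    # these are learned by model.fit(), not set by hand.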

Hyperparameter tuning is the process of finding the optimal combination of those hyperparameters that minimizes the cost function. The key idea is that if we have n hyperparameters, then we can imagine that they define a space with n dimensions, and the goal is to find the point in this space that corresponds to an optimal value of the cost function. One way to achieve this goal is to create a grid in this space and systematically check the value assumed by the cost function at each grid vertex. In other words, each hyperparameter's range is divided into buckets, and the different combinations of values are checked via a brute-force approach.
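
A minimal brute-force grid search over such buckets might look like the following sketch; the candidate values, the network shape, and the use of the MNIST test split for validation are illustrative assumptions rather than the book's prescribed setup:

    import itertools
    import tensorflow as tf

    (x_train, y_train), (x_val, y_val) = tf.keras.datasets.mnist.load_data()
    x_train, x_val = x_train / 255.0, x_val / 255.0  # normalize pixels

    # Each hyperparameter's range is divided into buckets of candidates.
    grid = {
        'n_hidden':   [64, 128, 256],
        'batch_size': [64, 128],
        'epochs':     [5, 10],
    }

    best_loss, best_combo = float('inf'), None
    # Visit every vertex of the grid: one full training run per combination.
    for n_hidden, batch_size, epochs in itertools.product(*grid.values()):
        model = tf.keras.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(n_hidden, activation='relu'),
            tf.keras.layers.Dense(10, activation='softmax'),
        ])
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        model.fit(x_train, y_train, batch_size=batch_size,
                  epochs=epochs, verbose=0)
        loss, _ = model.evaluate(x_val, y_val, verbose=0)
        if loss < best_loss:
            best_loss, best_combo = loss, (n_hidden, batch_size, epochs)

    print('best validation loss:', best_loss, 'for', best_combo)

Even this tiny grid contains 3 x 2 x 2 = 12 full training runs, which hints at why the brute-force approach quickly becomes expensive.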

If you think that this process of fine-tuning the hyperparameters is manual and expensive, then you are absolutely right! However, during the last few years we have seen significant results in AutoML, a set of research techniques that aim both to tune hyperparameters automatically and to search automatically for optimal network architectures. We will discuss this further in Chapter 14, An introduction to AutoML.