Hands-On Neural Networks

By: Leonardo De Marchi, Laura Mitchell

Overview of this book

Neural networks play an important role in deep learning and artificial intelligence (AI), with applications in domains ranging from medical diagnosis to financial forecasting and machine diagnostics. Hands-On Neural Networks is designed to guide you through learning about neural networks in a practical way. The book starts with a brief introduction to perceptron networks. You will then gain insights into machine learning and consider what the future of AI could look like. Next, you will study how embeddings can be used to process textual data, and how long short-term memory networks (LSTMs) help solve common natural language processing (NLP) problems. The later chapters demonstrate how to implement advanced concepts including transfer learning, generative adversarial networks (GANs), autoencoders, and reinforcement learning. Finally, the book covers the latest advancements in the field of neural networks. By the end of this book, you will have the skills you need to build, train, and optimize your own neural network model that can be used to provide predictable solutions.
Table of Contents (16 chapters)
Section 1: Getting Started
Section 2: Deep Learning Applications
Section 3: Advanced Applications

Training VAEs

When training a VAE, it is necessary to calculate how the overall loss changes with respect to each parameter in the network, that is, the gradient of the loss with respect to each parameter. The procedure that computes these gradients is called backpropagation.
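As a small illustration of what backpropagation computes, the sketch below applies the chain rule by hand to a two-parameter network. The network shape, the function names `forward` and `backward`, the squared-error loss, and all numeric values are assumptions chosen for illustration; they do not come from the book.

```python
import math


def forward(w1, w2, x, target):
    """Tiny network: h = tanh(w1 * x), y = w2 * h, squared-error loss."""
    h = math.tanh(w1 * x)
    y = w2 * h
    loss = (y - target) ** 2
    return h, y, loss


def backward(w1, w2, x, target):
    """Apply the chain rule from the loss back to each parameter."""
    h, y, _ = forward(w1, w2, x, target)
    dloss_dy = 2.0 * (y - target)
    grad_w2 = dloss_dy * h                   # dy/dw2 = h
    dloss_dh = dloss_dy * w2                 # dy/dh = w2
    grad_w1 = dloss_dh * (1.0 - h * h) * x   # dh/dw1 = (1 - tanh^2) * x
    return grad_w1, grad_w2


# Illustrative values only.
w1, w2, x, target = 0.5, -1.2, 2.0, 0.7
grad_w1, grad_w2 = backward(w1, w2, x, target)

# A small step against the gradient should lower the loss.
lr = 0.01
loss_before = forward(w1, w2, x, target)[2]
loss_after = forward(w1 - lr * grad_w1, w2 - lr * grad_w2, x, target)[2]
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.4f}")
```

Deep learning frameworks automate exactly this bookkeeping, which is why every operation between a parameter and the loss needs a well-defined derivative.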

Standard autoencoders use backpropagation to propagate the reconstruction loss back through the weights of the network. However, VAEs are not as straightforward to train, because the sampling operation is not differentiable: the gradients of the reconstruction error cannot be propagated back through the sampling step to the encoder.

The reparameterization trick can be used to overcome this limitation. The idea behind the reparameterization trick is to sample ε from a unit normal distribution, scale it by the latent attribute's standard deviation 𝜎, and then shift it by the latent attribute's mean μ:

z = μ + 𝜎 ⊙ ε, where ε ~ N(0, I)

Performing this operation essentially removes the sampling process from the flow of gradients, as it is now...
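The reparameterization trick described above can be sketched in plain Python. This is a minimal, framework-free illustration rather than the book's implementation: the function names `reparameterize` and `loss_and_grads`, the scalar latent variable, the toy squared-error loss standing in for a decoder and reconstruction error, and all numeric values are assumptions.

```python
import math
import random


def reparameterize(mu, sigma, eps):
    """Deterministic transform of the noise: z = mu + sigma * eps."""
    return mu + sigma * eps


def loss_and_grads(mu, sigma, eps, target):
    """Toy squared-error loss on z with analytic gradients for mu and sigma.

    Hypothetical stand-in for a decoder plus reconstruction loss.
    """
    z = reparameterize(mu, sigma, eps)
    loss = (z - target) ** 2
    dloss_dz = 2.0 * (z - target)
    grad_mu = dloss_dz           # dz/dmu = 1
    grad_sigma = dloss_dz * eps  # dz/dsigma = eps
    return loss, grad_mu, grad_sigma


rng = random.Random(0)
eps = rng.gauss(0.0, 1.0)  # the only random draw; it does not depend on mu or sigma

mu, sigma, target = 0.3, 0.8, 1.5  # illustrative values only
loss, grad_mu, grad_sigma = loss_and_grads(mu, sigma, eps, target)
print(f"eps={eps:.4f} loss={loss:.4f} dL/dmu={grad_mu:.4f} dL/dsigma={grad_sigma:.4f}")
```

Because `eps` is drawn once and then treated as a constant, the loss becomes an ordinary differentiable function of `mu` and `sigma`. In a framework with automatic differentiation, the same structure is what allows gradients to reach the encoder outputs.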