Book Image

Neural Network Programming with TensorFlow

By : Manpreet Singh Ghotra, Rajdeep Dua
Book Image

Neural Network Programming with TensorFlow

By: Manpreet Singh Ghotra, Rajdeep Dua

Overview of this book

If you're aware of the buzz surrounding the terms such as "machine learning," "artificial intelligence," or "deep learning," you might know what neural networks are. Ever wondered how they help in solving complex computational problem efficiently, or how to train efficient neural networks? This book will teach you just that. You will start by getting a quick overview of the popular TensorFlow library and how it is used to train different neural networks. You will get a thorough understanding of the fundamentals and basic math for neural networks and why TensorFlow is a popular choice Then, you will proceed to implement a simple feed forward neural network. Next you will master optimization techniques and algorithms for neural networks using TensorFlow. Further, you will learn to implement some more complex types of neural networks such as convolutional neural networks, recurrent neural networks, and Deep Belief Networks. In the course of the book, you will be working on real-world datasets to get a hands-on understanding of neural network programming. You will also get to train generative models and will learn the applications of autoencoders. By the end of this book, you will have a fair understanding of how you can leverage the power of TensorFlow to train neural networks of varying complexities, without any hassle. While you are learning about various neural network implementations you will learn the underlying mathematics and linear algebra and how they map to the appropriate TensorFlow constructs.
Table of Contents (17 chapters)
Title Page
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Customer Feedback
Preface

Introduction to long short term memory networks


The vanishing gradient problem has appeared as the biggest obstacle to recurrent networks.

As the straight line changes along the x axis with a slight change in the y axis, the gradient shows change in all the weights with regard to change in error. If we don't know the gradient, we will not be able to adjust the weights in a direction that will reduce the loss or error, and our neural network ceases to learn.

Long short term memories (LSTMs) are designed to overcome the vanishing gradient problem. Retaining information for a larger duration of time is effectively their implicit behavior.

In standard RNNs, the repeating cell will have an elementary structure, such as a singletanh layer:

As seen in the preceding image, LSTMs also have a chain-like structure, but the recurrent cell has a different structure:

Life cycle of LSTM

The key to LSTMs is the cell state that is like a conveyor belt. It moves down the stream with minor linear interactions. It...