Recurrent Neural Networks with Python Quick Start Guide

By: Simeon Kostadinov

Overview of this book

Developers struggle to find an easy-to-follow learning resource for implementing Recurrent Neural Network (RNN) models. RNNs are the state-of-the-art model in deep learning for dealing with sequential data. From language translation to generating captions for an image, RNNs are used to continuously improve results. This book will teach you the fundamentals of RNNs, with example applications in Python and the TensorFlow library. The examples are accompanied by the right combination of theoretical knowledge and real-world implementations of concepts to build a solid foundation of neural network modeling. Your journey starts with the simplest RNN model, where you can grasp the fundamentals. The book then builds on this by proposing more advanced and complex algorithms. We use them to explain how a typical state-of-the-art RNN model works. From generating text to building a language translator, we show how some of today's most powerful AI applications work under the hood. After reading the book, you will be confident with the fundamentals of RNNs, and be ready to pursue further study, along with developing skills in this exciting field.

What is an LSTM network?

An LSTM (long short-term memory) network is an advanced RNN designed to mitigate the vanishing gradient problem and to perform well on longer sequences. In the previous chapter, we introduced the GRU network, which is a simpler version of the LSTM. Both include memory states that determine what information should be propagated further at each timestep. The LSTM cell looks as follows:

[Figure: diagram of the LSTM cell]

Let's introduce the main equations that will clarify the preceding diagram. They are similar to the ones for gated recurrent units (see Chapter 3, Generating Your Own Book Chapter). Here is what happens at every given timestep, t:
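
In the standard LSTM formulation, one timestep can be written as shown next. The notation is an assumption and may differ from the book's: $\sigma$ is the logistic sigmoid, $\odot$ is element-wise multiplication, $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state with the current input, and $f_t$ and $\tilde{c}_t$ denote the forget gate and the candidate memory state.

$$
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

The additive update of $c_t$ is what lets gradients flow across many timesteps: when $f_t$ stays close to 1, the memory state is carried forward largely unchanged.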

$o_t$ is the output gate, which determines what exactly is important for the current prediction and what information should be kept around for the future. $i_t$ is called the input gate, and determines...
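
To make the role of the gates concrete, here is a minimal pure-Python sketch of one timestep of a standard LSTM cell. It is not the book's TensorFlow code. The function names (`lstm_step`, `init_params`), the parameter layout, and the forget gate `f` and candidate state `g` are assumptions taken from the common LSTM formulation.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, used by all three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W and list vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM timestep (assumed standard formulation); returns (h_t, c_t)."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    g = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # candidate state
    # New memory state: keep part of the old state, add part of the candidate.
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # New hidden state: the output gate filters the squashed memory state.
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights and zero biases for each gate."""
    rng = random.Random(seed)
    params = {}
    for gate in ("f", "i", "o", "c"):
        params["W_" + gate] = [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden_size + input_size)]
            for _ in range(hidden_size)
        ]
        params["b_" + gate] = [0.0] * hidden_size
    return params


# Run the cell over a short sequence of three one-hot inputs.
params = init_params(input_size=3, hidden_size=2)
h, c = [0.0, 0.0], [0.0, 0.0]
for x_t in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    h, c = lstm_step(x_t, h, c, params)
```

The hidden state `h` and memory state `c` are threaded through the loop, so each step sees a summary of everything before it. In practice, a library layer such as a TensorFlow LSTM layer would replace this hand-written step.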