Python Deep Learning Projects

By: Matthew Lamons, Rahul Kumar, Abhishek Nagaraja
Overview of this book

Deep learning has been gradually revolutionizing every field of artificial intelligence, making application development easier. Python Deep Learning Projects imparts the knowledge needed to implement complex deep learning projects in computational linguistics and computer vision. Each project is unique, helping you progressively master the subject. You'll learn how to implement a text classifier using a recurrent neural network (RNN) model and optimize it, so that you understand the shortcomings you might experience while implementing a simple deep learning system. Similarly, you'll develop a variety of projects, including word vector representation, open-domain question answering, and chatbots built with sequence-to-sequence (seq2seq) models and language modeling. You'll also cover advanced concepts, such as regularization, gradient clipping, gradient normalization, and bidirectional RNNs, through a series of engaging projects. By the end of this book, you will have the knowledge to develop your own deep learning systems in a straightforward and efficient way.
Chapter 8: Handwritten Digits Classification Using ConvNets

LSTM architecture

The desire to model sequential data more effectively, without the limitations of the vanishing gradient problem, led researchers to create the long short-term memory (LSTM) variant of the RNN architecture. LSTM achieves better performance because it incorporates gates that control the memory process in the cell. The following diagram shows an LSTM cell:

An LSTM unit (source: http://colah.github.io/posts/2015-08-Understanding-LSTMs)

An LSTM cell consists of three primary elements, labeled as 1, 2, and 3 in the preceding diagram:

  1. The forget gate f(t): This gate gives the LSTM cell the ability to forget information that is no longer needed. The sigmoid activation accepts the inputs X(t) and h(t-1) and outputs f(t), a vector of values between 0 and 1; elements of the old cell state are removed wherever f(t) is close to 0. This gate's contribution to the new cell state is f(t)*c(t-1).
  2. Information from the new input, X(t), that is determined...
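The gating mechanism described above can be sketched as code. The following is a minimal NumPy implementation of a single forward step of a standard LSTM cell (not taken from the book's code); the function and parameter names (`lstm_cell_step`, `W_f`, `b_f`, and so on) are illustrative choices, and in practice you would use a framework implementation such as TensorFlow's or Keras' LSTM layer rather than writing this by hand:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell (illustrative sketch).

    params holds weight matrices W_* of shape (hidden, input + hidden)
    and bias vectors b_* for the forget, input, candidate, and output paths.
    """
    # Concatenate the previous hidden state with the current input
    z = np.concatenate([h_prev, x_t])

    # 1. Forget gate f(t): values near 0 erase parts of the old cell state
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])
    # 2. Input gate and candidate values: decide what new information to store
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])
    c_hat = np.tanh(params["W_c"] @ z + params["b_c"])
    # New cell state: forget old content, then add the gated new candidates
    c_t = f_t * c_prev + i_t * c_hat
    # 3. Output gate: decides which parts of the cell state become the output
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(input_dim, hidden_dim, seed=0):
    """Randomly initialize the four gate weight matrices and biases."""
    rng = np.random.default_rng(seed)
    d = input_dim + hidden_dim
    params = {}
    for gate in ("f", "i", "c", "o"):
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, size=(hidden_dim, d))
        params[f"b_{gate}"] = np.zeros(hidden_dim)
    return params
```

Because the output h(t) is the product of a sigmoid gate and a tanh of the cell state, every element of h(t) lies strictly between -1 and 1, while the cell state c(t) itself is unbounded and can accumulate information across many time steps.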