Book Image

Python Deep Learning Projects

By : Matthew Lamons, Rahul Kumar, Abhishek Nagaraja
Book Image

Python Deep Learning Projects

By: Matthew Lamons, Rahul Kumar, Abhishek Nagaraja

Overview of this book

Deep learning has been gradually revolutionizing every field of artificial intelligence, making application development easier. Python Deep Learning Projects imparts all the knowledge needed to implement complex deep learning projects in the field of computational linguistics and computer vision. Each of these projects is unique, helping you progressively master the subject. You’ll learn how to implement a text classifier system using a recurrent neural network (RNN) model and optimize it to understand the shortcomings you might experience while implementing a simple deep learning system. Similarly, you’ll discover how to develop various projects, including word vector representation, open domain question answering, and building chatbots using seq-to-seq models and language modeling. In addition to this, you’ll cover advanced concepts, such as regularization, gradient clipping, gradient normalization, and bidirectional RNNs, through a series of engaging projects. By the end of this book, you will have gained knowledge to develop your own deep learning systems in a straightforward way and in an efficient way
Table of Contents (17 chapters)
8
Handwritten Digits Classification Using ConvNets

Training the captioning model

Now, let's train the model. The first thing we need to do is to extract the features stored in the respective .npy files and then pass those features through the CNN encoder.

The encoder output, hidden state (initialized to 0) and the decoder input (which is the start token) are passed to the decoder. The decoder returns the predictions and the decoder hidden state.

The decoder hidden state is then passed back into the model and the predictions are used to calculate the loss. While training, we use the teacher forcing technique to decide the next input to the decoder.

Teacher forcing is the technique where the target word is passed as the next input to the decoder. This technique helps to learn the correct sequence or correct statistical properties for the sequence, quickly.

The final step is to calculate the gradient and apply it to the optimizer...