Book Image

Deep Learning for Beginners

By : Dr. Pablo Rivas
Book Image

Deep Learning for Beginners

By: Dr. Pablo Rivas

Overview of this book

With information on the web exponentially increasing, it has become more difficult than ever to navigate through everything to find reliable content that will help you get started with deep learning. This book is designed to help you if you're a beginner looking to work on deep learning and build deep learning models from scratch, and you already have the basic mathematical and programming knowledge required to get started. The book begins with a basic overview of machine learning, guiding you through setting up popular Python frameworks. You will also understand how to prepare data by cleaning and preprocessing it for deep learning, and gradually go on to explore neural networks. A dedicated section will give you insights into the working of neural networks by helping you get hands-on with training single and multiple layers of neurons. Later, you will cover popular neural network architectures such as CNNs, RNNs, AEs, VAEs, and GANs with the help of simple examples, and learn how to build models from scratch. At the end of each chapter, you will find a question and answer section to help you test what you've learned through the course of the book. By the end of this book, you'll be well-versed with deep learning concepts and have the knowledge you need to use specific algorithms with various tools for different tasks.
Table of Contents (20 chapters)
1
Section 1: Getting Up to Speed
8
Section 2: Unsupervised Deep Learning
13
Section 3: Supervised Deep Learning

Long short-term memory models

Initially proposed by Hochreiter, Long Short-Term Memory Models (LSTMs) gained traction as an improved version of recurrent models [Hochreiter, S., et al. (1997)]. LSTMs promised to alleviate the following problems associated with traditional RNNs:

  • Vanishing gradients
  • Exploding gradients
  • The inability to remember or forget certain aspects of the input sequences

The following diagram shows a very simplified version of an LSTM. In (b), we can see the additional self-loop that is attached to some memory, and in (c), we can observe what the network looks like when unfolded or expanded:

Figure 13.6. Simplified representation of an LSTM

There is much more to the model, but the most essential elements are shown in Figure 13.6. Observe how an LSTM layer receives from the previous time step not only the previous output, but also something called state, which acts as a type of memory. In the diagram, you can see that while the current output and state are available...