The Deep Learning with PyTorch Workshop

By: Hyatt Saleh
Long Short-Term Memory Networks

As we mentioned previously, RNNs store only short-term memory. This is an issue when dealing with long sequences of data, where the network has trouble carrying information from the earlier steps through to the final ones.
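To make this concrete, here is a minimal sketch (the function name and weight values are illustrative, not from the book) of a single-unit vanilla RNN. Its only memory is the hidden state `h`, which is overwritten at every step, so the trace of an early input fades quickly:

```python
import math

# Illustrative single-unit vanilla RNN step. The hidden state h is the
# network's only memory; w_in and w_rec are arbitrary example weights.
def rnn_step(x, h, w_in=0.5, w_rec=0.5):
    # The new hidden state mixes the current input with the previous state.
    return math.tanh(w_in * x + w_rec * h)

h = 0.0
for x in [1.0, 0.0, 0.0, 0.0, 0.0]:  # a signal at step 0, then silence
    h = rnn_step(x, h)

print(h)  # after a few steps, the first input has nearly faded away
```

After only five steps, the hidden state retains just a small fraction of the original signal, which is why long sequences such as a full-length poem are problematic for a plain RNN.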

For instance, take the poem "The Raven," written by the famous poet Edgar Allan Poe, which is over 1,000 words long. Attempting to process it with a traditional RNN, with the objective of creating a similar poem, will result in the model leaving out crucial information from the opening stanzas. This, in turn, may produce an output that is unrelated to the initial subject of the poem. For instance, the model could ignore that the events occurred at night, making the new poem far less eerie.

This inability to hold long-term memory occurs because traditional RNNs suffer from a problem called vanishing gradients. The gradients, which are used to update the parameters of the network, become extremely small as they are propagated backward through many time steps: each step contributes another multiplicative factor to the chain rule, and when those factors are smaller than one, the gradient shrinks exponentially. As a result, the parameters associated with the earliest steps receive negligible updates, and the network fails to learn long-range dependencies.
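The exponential shrinkage described above can be sketched numerically. This is a simplified illustration (the function name and the constant local derivative of 0.5 are assumptions for demonstration, not values from the book): backpropagation through time multiplies the gradient by one local-derivative factor per time step.

```python
# Illustrative sketch of the vanishing gradient problem: each time step
# of backpropagation through time multiplies the gradient by a local
# derivative. If that factor is below 1, the gradient decays
# exponentially with sequence length.
def gradient_after_steps(local_derivative, num_steps):
    """Gradient magnitude reaching step 0 after backpropagating
    through num_steps time steps (starting from a gradient of 1.0)."""
    grad = 1.0  # gradient of the loss w.r.t. the final hidden state
    for _ in range(num_steps):
        grad *= local_derivative  # one chain-rule factor per step
    return grad

for steps in (5, 20, 100):
    print(steps, gradient_after_steps(0.5, steps))
```

With a factor of 0.5, the gradient after 100 steps is on the order of 10⁻³¹, far too small to drive any meaningful parameter update. LSTM networks mitigate this by routing information through an additive cell state, so the gradient is not forced through a long chain of shrinking multiplications.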