Book Image

Deep Learning with PyTorch Lightning

By : Kunal Sawarkar
3.5 (2)
Book Image

Deep Learning with PyTorch Lightning

3.5 (2)
By: Kunal Sawarkar

Overview of this book

Building and implementing deep learning (DL) is becoming a key skill for those who want to be at the forefront of progress.But with so much information and complex study materials out there, getting started with DL can feel quite overwhelming. Written by an AI thought leader, Deep Learning with PyTorch Lightning helps researchers build their first DL models quickly and easily without getting stuck on the complexities. With its help, you’ll be able to maximize productivity for DL projects while ensuring full flexibility – from model formulation to implementation. Throughout this book, you’ll learn how to configure PyTorch Lightning on a cloud platform, understand the architectural components, and explore how they are configured to build various industry solutions. You’ll build a neural network architecture, deploy an application from scratch, and see how you can expand it based on your specific needs, beyond what the framework can provide. In the later chapters, you’ll also learn how to implement capabilities to build and train various models like Convolutional Neural Nets (CNN), Natural Language Processing (NLP), Time Series, Self-Supervised Learning, Semi-Supervised Learning, Generative Adversarial Network (GAN) using PyTorch Lightning. By the end of this book, you’ll be able to build and deploy DL models with confidence.
Table of Contents (15 chapters)
1
Section 1: Kickstarting with PyTorch Lightning
6
Section 2: Solving using PyTorch Lightning
11
Section 3: Advanced Topics

Going through the CNN–RNN architecture

While there are many possible applications of semi-supervised learning and a number of possible neural architectures, we will start with one of the most popular, which is an architecture that combines CNN and RNN.

Simply put, we will be starting with an image, then use the CNN to recognize the image, and then pass the output of the CNN to an RNN, which in turn generates the text:

Figure 7.2 – CNN–RNN cascaded architecture

Intuitively speaking, the model is trained to recognize the images and their sentence descriptions so that it learns about the intermodal correspondence between language and visual data. It uses a CNN and a multimodal RNN to generate descriptions of the images. As mentioned above, LSTM is used for the implementation of the RNN.

This architecture was first proposed by Andrej Karpathy and his doctoral advisor Fei-Fei Li in their 2015 Stanford paper titled Generative Text Using...