Foundation Models in Computer Vision | Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch - Second Edition

By: V Kishore Ayyadevara, Yeshwanth Reddy

Overview of this book

Whether you are a beginner or are looking to progress in your computer vision career, this book guides you through the fundamentals of neural networks (NNs) and PyTorch and how to implement state-of-the-art architectures for real-world tasks. The second edition of Modern Computer Vision with PyTorch is fully updated to explain and provide practical examples of the latest multimodal models, CLIP, and Stable Diffusion. You’ll discover best practices for working with images, tweaking hyperparameters, and moving models into production. As you progress, you'll implement various use cases for facial keypoint recognition, multi-object detection, segmentation, and human pose detection. This book provides a solid foundation in image generation as you explore different GAN architectures. You’ll leverage transformer-based architectures like ViT, TrOCR, BLIP2, and LayoutLM to perform various real-world tasks and build a diffusion model from scratch. Additionally, you’ll utilize foundation models' capabilities to perform zero-shot object detection and image segmentation. Finally, you’ll learn best practices for deploying a model to production. By the end of this deep learning book, you'll confidently leverage modern NN architectures to solve real-world computer vision problems.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Free Benefits with Your Book

Section 1: Fundamentals of Deep Learning for Computer Vision

Artificial Neural Network Fundamentals

Free Benefits with Your Book

Comparing AI and traditional machine learning

Learning about the ANN building blocks

Implementing feedforward propagation

Implementing backpropagation

Understanding the impact of the learning rate

Summarizing the training process of a neural network

Summary

Questions

PyTorch Fundamentals

Installing PyTorch

PyTorch tensors

Building a neural network using PyTorch

Using a sequential method to build a neural network

Saving and loading a PyTorch model

Summary

Questions

Building a Deep Neural Network with PyTorch

Representing an image

Why leverage neural networks for image analysis?

Preparing our data for image classification

Training a neural network

Scaling a dataset to improve model accuracy

Understanding the impact of varying the batch size

Understanding the impact of varying the loss optimizer

Building a deeper neural network

Understanding the impact of batch normalization

The concept of overfitting

Summary

Questions

Section 2: Object Classification and Detection

Introducing Convolutional Neural Networks

The problem with traditional deep neural networks

Building blocks of a CNN

Implementing a CNN

Classifying images using deep CNNs

Visualizing the outcome of feature learning

Building a CNN for classifying real-world images

Summary

Questions

Transfer Learning for Image Classification

Introducing transfer learning

Understanding the VGG16 architecture

Understanding the ResNet architecture

Implementing facial keypoint detection

Implementing age estimation and gender classification

Introducing the torch_snippets library

Summary

Questions

Practical Aspects of Image Classification

Generating CAMs

Understanding the impact of data augmentation and batch normalization

Practical aspects to take care of during model implementation

Summary

Questions

Basics of Object Detection

Introducing object detection

Creating a bounding-box ground truth for training

Understanding region proposals

Understanding IoU

Non-max suppression

Mean average precision

Training R-CNN-based custom object detectors

Training Fast R-CNN-based custom object detectors

Summary

Questions

Advanced Object Detection

Components of modern object detection algorithms

Training Faster R-CNN on a custom dataset

Working details of YOLO

Training YOLO on a custom dataset

Working details of SSD

Training SSD on a custom dataset

Summary

Questions

Image Segmentation

Exploring the U-Net architecture

Performing upscaling

Implementing semantic segmentation using U-Net

Exploring the Mask R-CNN architecture

Implementing instance segmentation using Mask R-CNN

Predicting multiple instances of multiple classes

Summary

Questions

Applications of Object Detection and Segmentation

Multi-object instance segmentation

Human pose detection

Crowd counting

Image colorization

3D object detection with point clouds

Action recognition from video

Summary

Questions

Section 3: Image Manipulation

Autoencoders and Image Manipulation

Understanding autoencoders

Understanding variational autoencoders

Performing an adversarial attack on images

Understanding neural style transfer

Understanding deepfakes

Summary

Questions

Image Generation Using GANs

Introducing GANs

Using GANs to generate handwritten digits

Using DCGANs to generate face images

Implementing conditional GANs

Summary

Questions

Advanced GANs to Manipulate Images

Leveraging the Pix2Pix GAN

Leveraging CycleGAN

Leveraging StyleGAN on custom images

Introducing SRGAN

Summary

Questions

Section 4: Combining Computer Vision with Other Techniques

Combining Computer Vision and Reinforcement Learning

Learning the basics of reinforcement learning

Implementing Q-learning

Implementing deep Q-learning

Implementing deep Q-learning with the fixed targets model

Implementing an agent to perform autonomous driving

Summary

Questions

Combining Computer Vision and NLP Techniques

Introducing transformers

Implementing ViTs

Transcribing handwritten images

Document layout analysis

Visual question answering

Summary

Questions

Foundation Models in Computer Vision

Introducing CLIP

Introducing SAM

Introducing diffusion models

Understanding Stable Diffusion

Summary

Questions

Applications of Stable Diffusion

In-painting

ControlNet

SDXL Turbo

DepthNet

Text to video

Summary

Questions

Moving a Model to Production

Understanding the basics of an API

Creating an API and making predictions on a local server

Containerizing the application

Shipping and running the Docker container on the cloud

Identifying data drift

Using vector stores

Summary

Questions

Unlock Your Exclusive Benefits

Unlock this Book’s Free Benefits in 3 Easy Steps

Other Books You May Enjoy

Index

Appendix

Chapter 1, Artificial Neural Network Fundamentals

Chapter 2, PyTorch Fundamentals

Chapter 3, Building a Deep Neural Network with PyTorch

Chapter 4, Introducing Convolutional Neural Networks

Chapter 5, Transfer Learning for Image Classification

Chapter 6, Practical Aspects of Image Classification

Chapter 7, Basics of Object Detection

Chapter 8, Advanced Object Detection

Chapter 9, Image Segmentation

Chapter 10, Applications of Object Detection and Segmentation

Chapter 11, Autoencoders and Image Manipulation

Chapter 12, Image Generation Using GANs

Chapter 13, Advanced GANs to Manipulate Images

Chapter 14, Combining Computer Vision and Reinforcement Learning

Chapter 15, Combining Computer Vision and NLP Techniques

Chapter 16, Foundation Models in Computer Vision

Chapter 17, Applications of Stable Diffusion

Chapter 18, Moving a Model to Production
