Modern Computer Vision with PyTorch

Modern Computer Vision with PyTorch - Second Edition

By: V Kishore Ayyadevara, Yeshwanth Reddy

Overview of this book

Whether you are a beginner or are looking to progress in your computer vision career, this book guides you through the fundamentals of neural networks (NNs) and PyTorch and how to implement state-of-the-art architectures for real-world tasks. The second edition of Modern Computer Vision with PyTorch is fully updated to explain and provide practical examples of the latest multimodal models, CLIP, and Stable Diffusion. You’ll discover best practices for working with images, tweaking hyperparameters, and moving models into production. As you progress, you'll implement various use cases for facial keypoint recognition, multi-object detection, segmentation, and human pose detection. This book provides a solid foundation in image generation as you explore different GAN architectures. You’ll leverage transformer-based architectures like ViT, TrOCR, BLIP2, and LayoutLM to perform various real-world tasks and build a diffusion model from scratch. Additionally, you’ll utilize foundation models' capabilities to perform zero-shot object detection and image segmentation. Finally, you’ll learn best practices for deploying a model to production. By the end of this deep learning book, you'll confidently leverage modern NN architectures to solve real-world computer vision problems.
Table of Contents (27 chapters)

  • Section 1: Fundamentals of Deep Learning for Computer Vision
  • Section 2: Object Classification and Detection
  • Section 3: Image Manipulation
  • Section 4: Combining Computer Vision with Other Techniques
  • Other Books You May Enjoy
  • Index

Chapter 9, Image Segmentation

  1. How does upscaling help in the U-Net architecture?

Upscaling increases the spatial size of the feature maps in the decoder path so that the final output has the same height and width as the input image.
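As a rough illustration of the size bookkeeping, the sketch below tracks only the spatial size of a feature map through a hypothetical encoder and decoder. It assumes 2x2 max pooling with stride 2 on the way down and 2x2 transposed convolutions with stride 2 on the way up, uses the standard output-size formulas for those layers (dilation 1), and ignores the convolutions in between, which are assumed to preserve size. The function names and the depth of 4 are illustrative choices, not taken from the book.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose2d_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Spatial size after a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def unet_sizes(size, depth):
    """Trace the feature-map size down the encoder and back up the decoder."""
    sizes = [size]
    for _ in range(depth):  # encoder: each pooling step halves the size
        size = conv2d_out(size, kernel=2, stride=2)
        sizes.append(size)
    for _ in range(depth):  # decoder: each upscaling step doubles the size
        size = conv_transpose2d_out(size, kernel=2, stride=2)
        sizes.append(size)
    return sizes


print(unet_sizes(224, depth=4))
# [224, 112, 56, 28, 14, 28, 56, 112, 224]
```

With these assumptions, the upscaling steps exactly undo the downscaling steps, so a per-pixel prediction can be made at the input resolution.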

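  2. Why do we need to have a fully convolutional network in U-Net?

Because both the input and the output are images. Predicting an image-shaped tensor with a linear layer is impractical: it would need one weight for every pair of input and output values, and it would tie the network to a single input size.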
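To give a sense of scale, the following sketch compares parameter counts for two ways of producing a per-pixel prediction from a feature map: a linear layer over the flattened tensor versus a 1x1 convolution. The sizes used (64 channels, 224x224 pixels, 2 output classes) and the function names are assumptions chosen for illustration only.

```python
def linear_params(height, width, in_ch, out_ch):
    """Weights + biases of a linear layer mapping a flattened
    (in_ch, H, W) tensor to a flattened (out_ch, H, W) tensor."""
    in_features = in_ch * height * width
    out_features = out_ch * height * width
    return in_features * out_features + out_features


def conv_params(in_ch, out_ch, kernel=1):
    """Weights + biases of a convolution; independent of image size."""
    return in_ch * out_ch * kernel * kernel + out_ch


print(f"linear:   {linear_params(224, 224, 64, 2):,} parameters")
print(f"1x1 conv: {conv_params(64, 2):,} parameters")
```

The linear layer's count grows with the square of the number of pixels and is fixed to one image size, whereas the convolution's count does not depend on the image size at all.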

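  3. How does RoI Align improve upon RoI pooling in Mask R-CNN?

RoI pooling rounds the coordinates of each predicted proposal to the feature-map grid, which misaligns the extracted features with the region. RoI Align keeps the fractional offsets of the proposal and uses bilinear interpolation to sample the feature map at those exact locations, so the features stay aligned.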
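The difference can be sketched on a tiny feature map. The code below is a simplified illustration, not the Mask R-CNN implementation: it compares a single lookup that first rounds a fractional coordinate to the grid (the kind of quantization RoI pooling performs) with a bilinear interpolation at the exact fractional location (the sampling RoI Align relies on). A full RoI Align samples several such points per output bin and aggregates them; the helper names and the 4x4 map here are hypothetical.

```python
import math


def bilinear_sample(fmap, y, x):
    """Interpolate a 2D feature map (list of rows) at fractional (y, x)."""
    y0, x0 = math.floor(y), math.floor(x)
    y1 = min(y0 + 1, len(fmap) - 1)      # clamp to the last row
    x1 = min(x0 + 1, len(fmap[0]) - 1)   # clamp to the last column
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def quantized_sample(fmap, y, x):
    """Snap the coordinate to the nearest grid cell before reading it."""
    return fmap[round(y)][round(x)]


# Toy feature map whose value at (row, col) is 4 * row + col.
fmap = [[4 * r + c for c in range(4)] for r in range(4)]

print(quantized_sample(fmap, 1.25, 2.75))  # 7    (reads cell (1, 3))
print(bilinear_sample(fmap, 1.25, 2.75))   # 7.75 (blends the 4 neighbours)
```

Because the toy map is linear in its coordinates, the interpolated value matches the true value at the fractional location, while the quantized lookup is off by the rounding error.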

  4. What is the major difference between U-Net and Mask R-CNN for segmentation?

U-Net is fully convolutional and is a single end-to-end network, whereas Mask R-CNN combines several sub-networks, such as the backbone and the RPN, each handling a different task. Mask R-CNN can identify and separate several objects of the same type, whereas U-Net can only identify them; it cannot separate them into individual instances.
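The practical consequence shows up in the shape of the outputs. The sketch below uses hypothetical, hand-written data rather than real model outputs: instance segmentation is represented as a list of per-object masks, and semantic segmentation as a single map of class ids per pixel. Collapsing the former into the latter is easy, but the count of objects is lost along the way.

```python
def to_semantic(instances, height, width):
    """Collapse per-instance masks into one class-id map (0 = background)."""
    semantic = [[0] * width for _ in range(height)]
    for inst in instances:
        for r in range(height):
            for c in range(width):
                if inst["mask"][r][c]:
                    semantic[r][c] = inst["class_id"]
    return semantic


# Two adjacent objects of the same class (class id 1), one mask each.
instances = [
    {"class_id": 1, "mask": [[1, 1, 0, 0], [1, 1, 0, 0]]},
    {"class_id": 1, "mask": [[0, 0, 1, 1], [0, 0, 1, 1]]},
]

semantic = to_semantic(instances, height=2, width=4)
print(len(instances))  # 2 separate objects in the instance-style output
print(semantic)        # [[1, 1, 1, 1], [1, 1, 1, 1]]: one merged region
```

In the class-id map, the two objects become a single region of class 1, which is the sense in which a semantic segmentation output identifies a class but does not separate its instances.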

  5. What...