Python Image Processing Cookbook

By: Sandipan Dey

Overview of this book

With the advancements in wireless devices and mobile technology, there's increasing demand for people with digital image processing skills in order to extract useful information from the ever-growing volume of images. This book provides comprehensive coverage of the relevant tools and algorithms, and guides you through analysis and visualization for image processing. With the help of over 60 cutting-edge recipes, you'll address common challenges in image processing and learn how to perform complex tasks such as object detection, image segmentation, and image reconstruction using large hybrid datasets. Dedicated sections will also take you through implementing various image enhancement and image restoration techniques, such as cartooning, gradient blending, and sparse dictionary learning. As you advance, you'll get to grips with face morphing and image segmentation techniques. With an emphasis on practical solutions, this book will help you apply deep learning techniques such as transfer learning and fine-tuning to solve real-world problems. By the end of this book, you'll be proficient in utilizing the capabilities of the Python ecosystem to implement various image processing techniques effectively.
Table of Contents (11 chapters)

Object detection with Mask R-CNN

The Mask R-CNN algorithm (2017), by He et al., introduces a number of improvements over the Faster R-CNN algorithm for region-based object detection, with the following two primary contributions:

  • RoI pooling is replaced with an RoIAlign module, which is more accurate.
  • An additional branch is inserted at the output of the RoIAlign module. It receives the RoIAlign output and feeds it into two successive convolutional layers; the output of the last convolutional layer forms the object mask.
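
To make the first contribution concrete, the following pure-Python sketch contrasts a simplified, quantized RoI pooling with an RoIAlign-style bilinear sampling on a tiny feature map. It is an illustration under stated assumptions, not the implementation from the Mask R-CNN paper or from any library: the function names (`bilinear_sample`, `roi_align`, `roi_pool_quantized`) are hypothetical, the RoI is given directly in feature-map coordinates as `(y_start, x_start, y_end, x_end)`, and details such as the image-to-feature-map scale factor and pixel-center offsets are omitted.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map (list of lists) at a real-valued (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp the sampling point so that it stays inside the feature map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, roi, out_size=2, samples=2):
    """RoIAlign-style pooling: average bilinear samples per bin, with no rounding."""
    y_start, x_start, y_end, x_end = roi
    bin_h = (y_end - y_start) / out_size
    bin_w = (x_end - x_start) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sampling points inside bin (i, j).
                    y = y_start + (i + (sy + 0.5) / samples) * bin_h
                    x = x_start + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature, y, x)
            row.append(total / (samples * samples))
        out.append(row)
    return out

def roi_pool_quantized(feature, roi, out_size=2):
    """Simplified RoI pooling: round the RoI to whole cells, then max-pool each bin."""
    h, w = len(feature), len(feature[0])
    # Rounding discards the sub-cell position of the RoI
    # (note that Python's round() rounds halves to the nearest even integer).
    y0, x0, y1, x1 = (int(round(v)) for v in roi)
    y0, x0 = min(max(y0, 0), h - 1), min(max(x0, 0), w - 1)
    y1, x1 = min(max(y1, y0 + 1), h), min(max(x1, x0 + 1), w)
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            ys = y0 + (i * (y1 - y0)) // out_size
            ye = min(h, max(ys + 1, y0 + math.ceil((i + 1) * (y1 - y0) / out_size)))
            xs = x0 + (j * (x1 - x0)) // out_size
            xe = min(w, max(xs + 1, x0 + math.ceil((j + 1) * (x1 - x0) / out_size)))
            row.append(max(feature[y][x] for y in range(ys, ye) for x in range(xs, xe)))
        out.append(row)
    return out

# A 6x6 horizontal ramp: the value of each cell equals its column index.
feature_map = [[float(x) for x in range(6)] for _ in range(6)]

# Two RoIs that differ only by a 0.3-cell horizontal shift.
roi_a = (1.0, 1.1, 3.0, 3.1)
roi_b = (1.0, 1.4, 3.0, 3.4)

print("quantized pooling:", roi_pool_quantized(feature_map, roi_a),
      roi_pool_quantized(feature_map, roi_b))
print("RoIAlign-style:   ", roi_align(feature_map, roi_a),
      roi_align(feature_map, roi_b))
```

In this toy setup, both RoIs round to the same integer box, so the quantized pooling cannot tell them apart, whereas the bilinear sampling tracks the sub-cell shift. That lost offset is the kind of misalignment that matters when predicting a per-pixel mask.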

The RoIAlign module provides a more precise correspondence between the selected regions of the feature map and the corresponding regions of the input image. Pixel-level segmentation requires much finer alignment than computing bounding boxes alone. The following screenshot shows the architecture of Mask R-CNN: