Hands-On Computer Vision with TensorFlow 2

By: Benjamin Planche, Eliot Andres

Overview of this book

Computer vision solutions are becoming increasingly common, making their way into fields such as healthcare, the automotive industry, social media, and robotics. This book will help you explore TensorFlow 2, the brand-new version of Google's open source framework for machine learning. You will understand how to benefit from using convolutional neural networks (CNNs) for visual tasks. Hands-On Computer Vision with TensorFlow 2 starts with the fundamentals of computer vision and deep learning, teaching you how to build a neural network from scratch. You will discover the features that have made TensorFlow the most widely used AI library, along with its intuitive Keras interface. You'll then move on to building, training, and deploying CNNs efficiently. Complete with concrete code examples, the book demonstrates how to classify images with modern solutions, such as Inception and ResNet, and extract specific content using You Only Look Once (YOLO), Mask R-CNN, and U-Net. You will also build generative adversarial networks (GANs) and variational autoencoders (VAEs) to create and edit images, and long short-term memory networks (LSTMs) to analyze videos. In the process, you will acquire advanced insights into transfer learning, data augmentation, domain adaptation, and mobile and web deployment, among other key concepts. By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.
Table of Contents (16 chapters)

  • Section 1: TensorFlow 2 and Deep Learning Applied to Computer Vision
  • Section 2: State-of-the-Art Solutions for Classic Recognition Problems
  • Section 3: Advanced Concepts and New Frontiers of Computer Vision
  • Assessments

Faster R-CNN – a powerful object detection model

The main benefit of YOLO is its speed. While it can achieve very good results, it is now outperformed in accuracy by more complex networks. Faster Region-based Convolutional Neural Network (Faster R-CNN) is considered state of the art at the time of writing. It is also reasonably fast, reaching 4-5 FPS on a modern GPU. In this section, we will explore its architecture.
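To put the speed figure in perspective, 4-5 FPS corresponds to roughly 200-250 ms of processing per image. The following is a minimal sketch of how such a throughput figure could be measured. The `measure_fps` helper and the `dummy_detector` stand-in are hypothetical names introduced for illustration only; a real measurement would pass an actual detection model's inference function instead.

```python
import time


def measure_fps(detect_fn, frames, warmup=1):
    """Return the average number of frames processed per second."""
    # Warm-up calls are excluded from the timing, since the first calls
    # to a model are often slower than the following ones.
    for frame in frames[:warmup]:
        detect_fn(frame)

    start = time.perf_counter()
    for frame in frames:
        detect_fn(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


def dummy_detector(frame):
    # Hypothetical stand-in for a real detector: it only sleeps to
    # simulate about 20 ms of inference time and returns no boxes.
    time.sleep(0.02)
    return []


frames = [None] * 10  # placeholder inputs; real code would use images
fps = measure_fps(dummy_detector, frames)
print(f"{fps:.1f} FPS ({1000.0 / fps:.0f} ms per frame)")
```

The measured value depends heavily on the hardware, the input resolution, and the backbone network, so figures such as the one quoted above are best read as orders of magnitude.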

The Faster R-CNN architecture was engineered over several years of research. More precisely, it was built incrementally from two earlier architectures, R-CNN and Fast R-CNN. Here, we will focus on the latest of the three, Faster R-CNN, introduced in the following paper:

  • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (2015), Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun

This paper draws a lot of knowledge from the two previous designs. Therefore, some of the...