Computer Vision on AWS

By: Lauren Mullennex, Nate Bachmeier, Jay Rao

Overview of this book

Computer vision (CV) is a field of artificial intelligence that helps transform visual data into actionable insights to solve a wide range of business challenges. This book provides prescriptive guidance to anyone looking to learn how to approach CV problems for quickly building and deploying production-ready models. You’ll begin by exploring the applications of CV and the features of Amazon Rekognition and Amazon Lookout for Vision. The book will then walk you through real-world use cases such as identity verification, real-time video analysis, content moderation, and detecting manufacturing defects that’ll enable you to understand how to implement AWS AI/ML services. As you make progress, you'll also use Amazon SageMaker for data annotation, training, and deploying CV models. In the concluding chapters, you'll work with practical code examples, and discover best practices and design principles for scaling, reducing cost, improving the security posture, and mitigating bias of CV workloads. By the end of this AWS book, you'll be able to accelerate your business outcomes by building and implementing CV into your production environments with the help of AWS AI/ML services.
Table of Contents (21 chapters)
Part 1: Introduction to CV on AWS and Amazon Rekognition
Part 2: Applying CV to Real-World Use Cases
Part 3: CV at the edge
Part 4: Building CV Solutions with Amazon SageMaker
Part 5: Best Practices for Production-Ready CV Workloads

Exploring AWS AI/ML services

Building and deploying a production CV model presents many challenges. It’s often difficult to find the right ML skill sets. Gathering and labeling high-quality data is a manual and costly process. Data processing and feature engineering require domain expertise. Developing, training, and testing ML models takes time. Once a model is created and deployed into production, it’s challenging to scale on-premises and difficult to understand which metrics to monitor to detect data and model quality drift. Reducing inference latency, automating the retraining process, and managing the underlying infrastructure are also concerns.

AWS AI/ML services are designed to address these challenges. These services are fully managed, so you don’t have to worry about their underlying architecture. You can also optimize your costs by only paying for what you use. Within the portfolio of AWS AI/ML services, there are several approaches to choose from when building your CV application.

AWS AI services

AWS AI services provide pre-trained models that use DL technology to solve common use cases such as image classification, personalized recommendations, fraud detection, anomaly detection, and NLP. These services don’t require any ML expertise and they’re easily integrated into your applications or with other AWS services by calling APIs. They help remove the undifferentiated heavy lifting of dealing with image preprocessing and feature extraction. This way, you can focus on solving your business problems and moving to production faster.

One of the AI services for CV is Amazon Rekognition. It is a fully managed DL-based service that detects objects, people, activities, scenes, text, and inappropriate content in images and videos. It also provides facial analysis and facial search capabilities. Rekognition contains pre-trained models but also allows you to train your own custom model using Rekognition Custom Labels. In the next two chapters, we provide code examples and applications of Rekognition and Rekognition Custom Labels.
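To make the "calling APIs" point concrete, here is a minimal sketch of label detection with the Rekognition DetectLabels operation through boto3. The helper name `detect_labels`, its default thresholds, and the optional `client` parameter are assumptions made for illustration; they are not taken from the book's own examples. The sketch assumes AWS credentials and a default region are already configured when no client is passed in.

```python
def detect_labels(image_bytes, client=None, max_labels=10, min_confidence=80.0):
    """Return (label name, confidence) pairs that Rekognition finds in an image.

    `client` is optional so a stub can be passed in tests; when omitted, a real
    boto3 Rekognition client is created (requires AWS credentials and a region).
    """
    if client is None:
        import boto3  # imported lazily so the helper can be exercised offline

        client = boto3.client("rekognition")

    # DetectLabels accepts raw image bytes or a reference to an S3 object.
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,          # illustrative default, tune as needed
        MinConfidence=min_confidence,  # illustrative default, tune as needed
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

A call might look like `detect_labels(open("photo.jpg", "rb").read())`, where `photo.jpg` is a hypothetical local file. No model training or image preprocessing is involved; the service returns labels with confidence scores directly.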

Amazon Lookout for Vision is another AI service that uses CV to detect anomalies and defects in manufacturing. Using pre-trained models, it helps improve industrial quality assurance by analyzing images to identify objects with visual defects. This helps improve your operational efficiency. In Chapter 7, we go into more detail about using Lookout for Vision.
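As a rough illustration of how an inspection image could be checked, the sketch below calls the Lookout for Vision DetectAnomalies operation through boto3. The helper name `is_defective` and the optional `client` parameter are assumptions for illustration, and the project name and model version must refer to a model that you have already trained and started hosting in your own account.

```python
def is_defective(image_bytes, project_name, model_version,
                 client=None, content_type="image/jpeg"):
    """Return (is_anomalous, confidence) for a single product image.

    `client` is optional so a stub can be passed in tests; when omitted, a real
    boto3 Lookout for Vision client is created (requires AWS credentials).
    """
    if client is None:
        import boto3  # imported lazily so the helper can be exercised offline

        client = boto3.client("lookoutvision")

    response = client.detect_anomalies(
        ProjectName=project_name,    # your own project, not defined here
        ModelVersion=model_version,  # the hosted model version, e.g. "1"
        Body=image_bytes,
        ContentType=content_type,    # "image/jpeg" or "image/png"
    )
    result = response["DetectAnomalyResult"]
    return result["IsAnomalous"], result["Confidence"]
```

The returned flag and confidence can then drive a downstream decision, such as routing a part for manual review when the image is classified as anomalous.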

For building and managing CV applications at the edge, AWS Panorama provides ML devices and a software development kit (SDK) to add CV to your cameras. This helps to automate costly inspection tasks by building CV applications to analyze video feeds. The Panorama appliance performs predictions locally for real-time decision-making. With Panorama, you can train your own models or select pre-built applications from AWS or third-party vendors.

These are only a few examples of the AWS AI services we will be focusing on in this book. For more details on the pre-trained services available for your applications, visit the AWS Machine Learning | AI Services page.

Amazon SageMaker

If you are interested in fine-tuning a pre-trained model, using built-in algorithms, or building your own custom ML model, Amazon SageMaker is a comprehensive, fully managed ML service that allows you to prepare data, build, train, and deploy ML models for any use case. SageMaker provides the infrastructure, tools, visual interfaces, workflows, and MLOps capabilities for every step of the ML life cycle to help you deploy and manage models at scale. SageMaker also contains an integrated development environment (IDE) called SageMaker Studio where you can perform all steps within the ML life cycle and orchestrate continuous integration/continuous deployment (CI/CD) pipelines. For more information on SageMaker Studio, refer to the book Getting Started with SageMaker Studio, by Michael Hsieh.

Figure 1.9 – Amazon SageMaker features and capabilities

With SageMaker, you can use transfer learning (TL) to fine-tune and reuse a pre-trained model without training a model from scratch. This saves you time and allows you to transfer the domain knowledge you gained previously to solve a new ML problem. This technique can be applied to CV or any type of business problem.

SageMaker contains dozens of pre-built algorithms that are optimized for speed, scale, and accuracy. They include support for supervised and unsupervised algorithms to solve a variety of use cases, including CV-related problems such as image classification, object detection, and semantic segmentation.
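The two ideas above, transfer learning and built-in algorithms, can be combined. The sketch below configures the SageMaker built-in image classification algorithm to start from pre-trained weights by setting its `use_pretrained_model` hyperparameter. It assumes version 2 of the SageMaker Python SDK; the helper names, the default values, the instance type, and the role and S3 arguments are all illustrative assumptions rather than settings from the book.

```python
def transfer_learning_hyperparameters(num_classes, num_training_samples,
                                      epochs=10, learning_rate=0.001,
                                      mini_batch_size=32):
    """Hyperparameters for the built-in image classification algorithm.

    use_pretrained_model=1 loads pre-trained weights and reinitializes only
    the final layer for the new set of classes.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    return {
        "use_pretrained_model": 1,
        "num_layers": 18,             # network depth (illustrative choice)
        "image_shape": "3,224,224",   # channels, height, width
        "num_classes": num_classes,
        "num_training_samples": num_training_samples,
        "epochs": epochs,
        "learning_rate": learning_rate,
        "mini_batch_size": mini_batch_size,
    }


def build_estimator(role_arn, output_s3_uri, hyperparameters, region):
    """Create an Estimator for the built-in algorithm (needs the sagemaker SDK)."""
    from sagemaker import image_uris        # imported lazily; SDK v2 assumed
    from sagemaker.estimator import Estimator

    # Look up the container image for the built-in algorithm in this region.
    image_uri = image_uris.retrieve("image-classification", region)
    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,                      # your own IAM execution role
        instance_count=1,
        instance_type="ml.p3.2xlarge",      # assumed GPU instance type
        output_path=output_s3_uri,          # your own S3 location
    )
    estimator.set_hyperparameters(**hyperparameters)
    return estimator
```

Training would then be started by calling `fit` on the returned estimator with your own training and validation channels in Amazon S3. Setting `use_pretrained_model` to 0 instead trains the same network from scratch.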

If a pre-trained or pre-built solution does not fit your needs, you have the option to build a custom ML model. There is a variety of powerful CPU and GPU compute options available for training and hosting your model on SageMaker. In Chapter 12, we will build a custom CV model on SageMaker to classify different types of skin cancer.

In this section, we provided an overview of the AWS AI/ML services related to CV. Next, we will show you how to set up the AWS environment that you will use throughout this book to build CV solutions.