Computer Vision on AWS

By: Lauren Mullennex, Nate Bachmeier, Jay Rao

Overview of this book

Computer vision (CV) is a field of artificial intelligence that helps transform visual data into actionable insights to solve a wide range of business challenges. This book provides prescriptive guidance to anyone looking to learn how to approach CV problems for quickly building and deploying production-ready models. You’ll begin by exploring the applications of CV and the features of Amazon Rekognition and Amazon Lookout for Vision. The book will then walk you through real-world use cases such as identity verification, real-time video analysis, content moderation, and detecting manufacturing defects that’ll enable you to understand how to implement AWS AI/ML services. As you make progress, you'll also use Amazon SageMaker for data annotation, training, and deploying CV models. In the concluding chapters, you'll work with practical code examples, and discover best practices and design principles for scaling, reducing cost, improving the security posture, and mitigating bias of CV workloads. By the end of this AWS book, you'll be able to accelerate your business outcomes by building and implementing CV into your production environments with the help of AWS AI/ML services.
Table of Contents (21 chapters)
Part 1: Introduction to CV on AWS and Amazon Rekognition
Part 2: Applying CV to Real-World Use Cases
Part 3: CV at the Edge
Part 4: Building CV Solutions with Amazon SageMaker
Part 5: Best Practices for Production-Ready CV Workloads

Summary

You have successfully built a video analysis pipeline for real-time video streams, using OpenCV and the Real Time Streaming Protocol (RTSP) to sample frames from IP cameras. You also processed a short recording from my iPhone, using the Amazon Rekognition Video API to identify the time offsets at which people appear and PIL to draw bounding boxes around them.
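As a reminder of how the time offsets and bounding boxes fit together, here is a minimal sketch that converts ratio-based bounding boxes into pixel coordinates and groups them by timestamp. It assumes a response shaped like the output of Rekognition Video person tracking (a `Persons` list whose entries carry a `Timestamp` in milliseconds and a `Person` with `Index` and `BoundingBox`). The function names, the sample data, and the frame size are hypothetical and are not taken from the chapter's code.

```python
def to_pixel_box(bounding_box, frame_width, frame_height):
    """Convert a ratio-based BoundingBox dict to (left, top, right, bottom) pixels."""
    left = bounding_box["Left"] * frame_width
    top = bounding_box["Top"] * frame_height
    right = left + bounding_box["Width"] * frame_width
    bottom = top + bounding_box["Height"] * frame_height

    def clamp(value, limit):
        # A detected box can extend slightly past the frame edge, so clamp it.
        return max(0, min(limit, round(value)))

    return (
        clamp(left, frame_width),
        clamp(top, frame_height),
        clamp(right, frame_width),
        clamp(bottom, frame_height),
    )


def boxes_by_timestamp(response, frame_width, frame_height):
    """Group (person index, pixel box) pairs by their timestamp in milliseconds."""
    grouped = {}
    for detection in response.get("Persons", []):
        person = detection["Person"]
        if "BoundingBox" not in person:
            continue  # Skip detections that carry no box.
        box = to_pixel_box(person["BoundingBox"], frame_width, frame_height)
        grouped.setdefault(detection["Timestamp"], []).append((person["Index"], box))
    return grouped


# Hypothetical sample data, assumed to mirror the person tracking response shape.
sample_response = {
    "Persons": [
        {
            "Timestamp": 0,
            "Person": {
                "Index": 0,
                "BoundingBox": {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25},
            },
        },
        {
            "Timestamp": 500,
            "Person": {
                "Index": 0,
                "BoundingBox": {"Left": 0.5, "Top": 0.5, "Width": 0.25, "Height": 0.25},
            },
        },
    ]
}

grouped_boxes = boxes_by_timestamp(sample_response, 1920, 1080)
```

Each resulting tuple is in the (left, top, right, bottom) order that PIL's `ImageDraw.Draw(image).rectangle(...)` accepts, so it can be drawn on the frame that matches the timestamp.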

You assembled an (almost) production-ready pipeline that processes files asynchronously and uses a completion notification to avoid polling for status. This foundational API opens the door to sports highlights, safety systems, and improved building layouts, among other potential scenarios. For example, if a traditional store reviewed its footage and determined that customers gravitate toward specific areas, it could place high-margin items in those areas to increase sales.
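The completion-notification pattern can be sketched as a small parser for the event that an SNS-subscribed AWS Lambda function receives. This is an illustration under assumptions, not the chapter's implementation: it assumes the notification body is a JSON string with `JobId`, `Status`, and `API` fields, which is how Rekognition Video notifications are commonly described, so verify the field names against the current documentation. The function names and the sample job ID are hypothetical.

```python
import json


def parse_completion_records(event):
    """Extract job details from an SNS-style Lambda event (assumed message shape)."""
    jobs = []
    for record in event.get("Records", []):
        # SNS delivers the notification body as a JSON-encoded string.
        message = json.loads(record["Sns"]["Message"])
        jobs.append(
            {
                "job_id": message["JobId"],
                "status": message["Status"],
                "api": message.get("API"),
            }
        )
    return jobs


def succeeded_job_ids(event):
    """Return only the job IDs whose status is SUCCEEDED."""
    return [
        job["job_id"]
        for job in parse_completion_records(event)
        if job["status"] == "SUCCEEDED"
    ]


# Hypothetical event for illustration; the job IDs are made up.
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps(
            {"JobId": "example-job-1", "Status": "SUCCEEDED", "API": "StartPersonTracking"}
        )}},
        {"Sns": {"Message": json.dumps(
            {"JobId": "example-job-2", "Status": "FAILED", "API": "StartPersonTracking"}
        )}},
    ]
}

ready_jobs = succeeded_job_ids(sample_event)
```

Because the handler acts only when a notification arrives, nothing has to call a status API in a loop. The IDs of the succeeded jobs can then be used to fetch the results.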

One limitation of our solution is that people are referenced by index, not by name. In the last chapter, you mitigated this issue using Amazon Rekognition’s Face API to build...