Intelligent Document Processing with AWS AI/ML

By: Sonali Sahu

Overview of this book

With the volume of data growing exponentially in this digital era, it has become paramount for professionals to process this data in an accelerated and cost-effective manner to get value out of it. Data that organizations receive is usually in raw document format, and being able to process these documents is critical to meeting growing business needs. This book is a comprehensive guide to helping you get to grips with AI/ML fundamentals and their application in document processing use cases. You’ll begin by understanding the challenges faced in legacy document processing and discover how you can build end-to-end document processing pipelines with AWS AI services. As you advance, you'll get hands-on experience with popular Python libraries to process and extract insights from documents. This book starts with the basics, taking you through real industry use cases for document processing to deliver value-based care in the healthcare industry and accelerate loan application processing in the financial industry. Throughout the chapters, you'll find out how to apply your skillset to solve practical problems. By the end of this AWS book, you’ll have mastered the fundamentals of document processing with machine learning through practical implementation.
Table of Contents (16 chapters)

Part 1: Accurate Extraction of Documents and Categorization
Part 2: Enrichment of Data and Post-Processing of Data
Part 3: Intelligent Document Processing in Industry Use Cases

Summary

In this chapter, we discussed the current challenges in document processing and how IDP can help overcome them. We introduced IDP by tracing the origins of AI, its evolution over the last few decades, and how it became an integral part of our everyday lives.

We then reviewed industry trends and market segmentation and saw, through examples, how important it is to automate document processing. We also discussed IDP across industry use cases, including an example of how patient data can be collected and enriched to improve patient outcome prediction.

Finally, we reviewed the stages of the IDP pipeline: data capture, data classification, data extraction, data enrichment, and data post-processing. This chapter gave readers an understanding of IDP and of the stages involved in automating the end-to-end pipeline.
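
As a rough illustration of how these five stages chain together, the following is a minimal Python sketch. It does not call any AWS service: the `Document` class, the stage functions, and the keyword and `key: value` rules inside them are hypothetical placeholders chosen only to show the order of the stages, not the book's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container passed from one pipeline stage to the next."""
    raw_text: str
    doc_type: str = "unknown"
    fields: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


def capture(doc: Document) -> Document:
    # Data capture: here we only normalize whitespace on text we already have.
    doc.raw_text = doc.raw_text.strip()
    doc.history.append("capture")
    return doc


def classify(doc: Document) -> Document:
    # Data classification: a toy keyword rule standing in for a real classifier.
    doc.doc_type = "invoice" if "invoice" in doc.raw_text.lower() else "other"
    doc.history.append("classify")
    return doc


def extract(doc: Document) -> Document:
    # Data extraction: pull "key: value" pairs out of the text.
    for line in doc.raw_text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            doc.fields[key.strip().lower()] = value.strip()
    doc.history.append("extract")
    return doc


def enrich(doc: Document) -> Document:
    # Data enrichment: add a derived field based on what was extracted.
    if "total" in doc.fields:
        is_numeric = doc.fields["total"].replace(".", "", 1).isdigit()
        doc.fields["total_is_numeric"] = str(is_numeric)
    doc.history.append("enrich")
    return doc


def post_process(doc: Document) -> Document:
    # Data post-processing: put fields in a stable order for downstream use.
    doc.fields = dict(sorted(doc.fields.items()))
    doc.history.append("post_process")
    return doc


PIPELINE: List[Callable[[Document], Document]] = [
    capture, classify, extract, enrich, post_process,
]


def run_pipeline(doc: Document) -> Document:
    # Run every stage in order, feeding each stage's output to the next.
    for stage in PIPELINE:
        doc = stage(doc)
    return doc
```

Calling `run_pipeline(Document("Invoice\nTotal: 42.50"))` returns the same document with its type, extracted fields, and stage history filled in. In a real pipeline, each placeholder function would presumably be replaced by a call to the relevant service for that stage.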

In the next chapter, we will go through the details of the data capture stage and document classification with AWS AI services. We will also look into the details of AWS AI services such as Amazon Comprehend custom classification and Amazon Rekognition for document classification.