The Applied AI and Natural Language Processing Workshop

By: Krishna Sankar, Jeffrey Jackovich, Ruze Richards

Overview of this book

Are you fascinated by applications like Alexa and Siri and how they process information within seconds before returning accurate results? Are you looking for a practical guide that will teach you how to build intelligent applications that can revolutionize the world of artificial intelligence? The Applied AI and NLP Workshop will take you on a practical journey where you will learn how to build artificial intelligence (AI) and natural language processing (NLP) applications with Amazon Web Services (AWS). Starting with an introduction to AI and machine learning, this book will explain how Amazon S3, or Amazon Simple Storage Service, works. You’ll then integrate AI with AWS to build serverless services and use Amazon’s NLP service, Comprehend, to perform text analysis on a document. As you advance, the book will help you get to grips with topic modeling to extract and analyze common themes in a set of documents with unknown topics. You’ll also work with Amazon Lex to create and customize a chatbot for task automation and use Amazon Rekognition to detect objects, scenes, and text in images. By the end of The Applied AI and NLP Workshop, you’ll be equipped with the knowledge and skills needed to build scalable intelligent applications with AWS.

Using Amazon Comprehend to Inspect Text and Determine the Primary Language

Amazon Comprehend searches and examines text to gather insights across a variety of subject areas (health, media, telecom, education, government, and so on) and languages. The first step in analyzing text data, before using more complex features such as topic, entity, and sentiment analysis, is to determine the dominant language, because the accuracy of the more in-depth analysis depends on it. Two operations determine the primary language of a text: DetectDominantLanguage, for a single text, and BatchDetectDominantLanguage, for a list of texts.
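As a rough sketch of how these two operations might be called from Python, the snippet below uses the boto3 SDK's `detect_dominant_language` and `batch_detect_dominant_language` client methods. The helper names (`make_client`, `dominant_language`, `dominant_languages`) and the default region are illustrative assumptions, not part of the book's text, and a real call requires configured AWS credentials.

```python
def make_client(region_name="us-east-1"):
    """Create a Comprehend client. The default region is an assumption."""
    import boto3  # imported lazily so the helpers below work without AWS access
    return boto3.client("comprehend", region_name=region_name)


def _top_language(languages):
    """Pick the entry with the highest confidence score, or None if empty."""
    return max(languages, key=lambda lang: lang["Score"], default=None)


def dominant_language(client, text):
    """Return (language_code, score) for one text, or None if nothing detected."""
    response = client.detect_dominant_language(Text=text)
    best = _top_language(response["Languages"])
    return (best["LanguageCode"], best["Score"]) if best else None


def dominant_languages(client, texts):
    """Return {index: language_code} for a list of texts, using one batch call."""
    response = client.batch_detect_dominant_language(TextList=texts)
    results = {}
    for item in response["ResultList"]:
        best = _top_language(item["Languages"])
        if best:
            results[item["Index"]] = best["LanguageCode"]
    return results
```

With credentials in place, `dominant_language(make_client(), "some text of sufficient length")` would return the top language code together with its confidence score. Passing the client in as an argument keeps the helpers easy to exercise with a stand-in object during testing.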

Both operations expect UTF-8 encoded text that is at least 20 characters long and no larger than 5,000 bytes. If you send a list of texts to BatchDetectDominantLanguage, it should contain no more than 25 items.
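The limits above can be checked locally before any request is sent. The following is a minimal sketch under the stated limits; the function names `check_text` and `check_batch` are hypothetical helpers introduced here for illustration. Note that the minimum is counted in characters while the maximum is counted in UTF-8 bytes, so non-ASCII text reaches the upper limit sooner.

```python
MIN_CHARS = 20        # minimum text length, in characters
MAX_BYTES = 5000      # maximum text size, in UTF-8 bytes
MAX_BATCH_ITEMS = 25  # maximum number of texts in one list


def check_text(text):
    """Return a list of problems with one text; an empty list means it passes."""
    problems = []
    if len(text) < MIN_CHARS:
        problems.append(
            f"too short: {len(text)} characters, need at least {MIN_CHARS}"
        )
    size = len(text.encode("utf-8"))  # byte size, not character count
    if size > MAX_BYTES:
        problems.append(f"too large: {size} UTF-8 bytes, limit is {MAX_BYTES}")
    return problems


def check_batch(texts):
    """Return {key: problems}; keys are item indexes, or "batch" for the list."""
    problems = {}
    if len(texts) > MAX_BATCH_ITEMS:
        problems["batch"] = [
            f"too many items: {len(texts)}, limit is {MAX_BATCH_ITEMS}"
        ]
    for index, text in enumerate(texts):
        issues = check_text(text)
        if issues:
            problems[index] = issues
    return problems
```

For example, a string of 2,501 copies of "é" is only 2,501 characters long but occupies 5,002 bytes in UTF-8, so `check_text` would flag it as too large.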

The response identifies the detected language with a two-letter code. The following table shows the language codes...