Amazon Comprehend gathers insights from text data across a variety of domains (health, media, telecom, education, government, and so on) and languages. Because the text can be in any of these languages, the first step before applying more complex features (such as topic, entity, and sentiment analysis) is to determine the dominant language. The accuracy of the more in-depth analysis depends on getting the language right.
Two operations examine text to determine its dominant language: DetectDominantLanguage and BatchDetectDominantLanguage.
DetectDominantLanguage accepts a single UTF-8 text string of at least 20 characters that must contain fewer than 5,000 bytes of UTF-8 encoded characters. BatchDetectDominantLanguage accepts a list of strings, with a maximum of 25 documents per list. Each document should have at least 20 characters, and must contain fewer than 5,000 bytes of UTF-8 encoded...
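The limits described above can be checked on the client side before any request is sent. The following is a minimal sketch, not a definitive implementation: the helper names (document_problems, batches, dominant_languages) are hypothetical, the constants are taken from the figures quoted in this text, and the client argument is assumed to be a boto3 Comprehend client whose batch_detect_dominant_language call returns a ResultList of entries with Index and Languages fields.

```python
MIN_CHARS = 20    # minimum document length stated in the text
MAX_BYTES = 5000  # documents must be smaller than this when UTF-8 encoded
MAX_BATCH = 25    # maximum number of documents per batch call


def document_problems(text):
    """Return the reasons (if any) why `text` falls outside the stated limits."""
    problems = []
    if len(text) < MIN_CHARS:
        problems.append("fewer than 20 characters")
    # The byte limit applies to the encoded size, not the character count.
    if len(text.encode("utf-8")) >= MAX_BYTES:
        problems.append("5,000 or more UTF-8 bytes")
    return problems


def batches(documents, size=MAX_BATCH):
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def dominant_languages(client, documents):
    """Return the highest-scoring language code per document (None if absent).

    Assumption: `client` behaves like boto3.client("comprehend").
    """
    codes = []
    for batch in batches(documents):
        response = client.batch_detect_dominant_language(TextList=batch)
        top = [None] * len(batch)  # documents reported as errors stay None
        for result in response.get("ResultList", []):
            languages = result["Languages"]
            if languages:
                best = max(languages, key=lambda lang: lang["Score"])
                top[result["Index"]] = best["LanguageCode"]
        codes.extend(top)
    return codes
```

With AWS credentials configured, a call might look like dominant_languages(boto3.client("comprehend"), texts), where texts is a list of strings that each pass document_problems. Splitting the list with batches keeps every request within the 25-document limit regardless of how many documents are supplied.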