The customer for whom we work also keeps track of what its competitors are doing. They keep a close eye on all the press releases, job postings, and public interactions of competitors. Information about competitors from various sources is captured in text format and fed to the Hadoop system. Every weekend, analysis is done to see trending topics and words, which are used by competitors to guess about the areas they are working or investing in.
The preceding paragraph is an example of first-level text analytics problem in Big Data space. To solve this problem, we will run classic word count using MapReduce. We will use it for word count each time a given word appears in all of the documents.