Elasticsearch for Hadoop

By: Vishal Shukla

Analyzing trends


Once the tweets are in Elasticsearch, we are ready to start analyzing them using the power of Elasticsearch. As we discussed earlier in the chapter, we are mainly interested in trends.

To perform a trend analysis, we first need to define what a trend is. A trend is something that occurs more frequently than usual within a specific time range, location, or field. In other words, it is about knowing what is unusually common. Essentially, we look for significant deviations from the normal scenario.

This process can also be seen as anomaly detection, in that we try to find what it is that deviates most from our overall dataset. We can call the full dataset the background dataset, and the subset we are interested in for a specific time range or location the foreground dataset. For example, the word Dornier occurs in about one in 1 million tweets in the background dataset; however, if you take...
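
To make the foreground versus background idea concrete, here is a minimal sketch in plain Python. It is not how Elasticsearch scores terms internally; it only compares how often a term appears in a foreground set of tweets with how often it appears in the background set, and keeps the terms whose rate is unusually high. The function name `trending_terms`, the `min_ratio` threshold, and the assumption that tweets are already tokenized are all ours, for illustration only.

```python
from collections import Counter


def trending_terms(foreground_docs, background_docs, min_ratio=2.0):
    """Return (term, ratio) pairs for terms that are unusually common
    in the foreground compared with the background.

    Each document is a list of tokens. The ratio is the share of
    foreground documents containing the term divided by the share of
    background documents containing it. Hypothetical helper, assuming
    the foreground is a subset of the background.
    """
    # Count each term once per document (document frequency).
    fg_counts = Counter(t for doc in foreground_docs for t in set(doc))
    bg_counts = Counter(t for doc in background_docs for t in set(doc))
    fg_total = len(foreground_docs)
    bg_total = len(background_docs)

    scored = []
    for term, fg_hits in fg_counts.items():
        bg_hits = bg_counts.get(term, 0)
        if bg_hits == 0:
            # Cannot happen when the foreground is a subset of the
            # background; skip to avoid dividing by zero otherwise.
            continue
        fg_rate = fg_hits / fg_total
        bg_rate = bg_hits / bg_total
        ratio = fg_rate / bg_rate
        if ratio >= min_ratio:
            scored.append((term, ratio))

    # Highest ratio first; ties are broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Made-up, pre-tokenized tweets used purely as sample data.
    background = [
        ["good", "morning"],
        ["morning", "coffee"],
        ["good", "coffee"],
        ["dornier", "crash", "morning"],
        ["dornier", "news"],
        ["good", "news"],
        ["coffee", "time"],
        ["good", "time"],
    ]
    # Foreground: the tweets from one hypothetical time window.
    foreground = background[3:5]
    for term, ratio in trending_terms(foreground, background):
        print(term, ratio)
```

A plain ratio like this is easy to read, but it favors very rare terms, so a real analysis would usually also require a minimum number of foreground occurrences before treating a term as a trend.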