Elasticsearch for Hadoop

By: Vishal Shukla

Classifying tweets using percolators


We now have a very simple trend analyzer. As discussed in the first section of the chapter, we are interested in analyzing trends at a higher level than individual hashtags. In this section, we will modify our Storm bolt to also classify the incoming data in real time. To perform the classification, we will look at the hashtags of the incoming data and check whether they meet certain criteria. Based on this, we will tag the document with the appropriate category.

Percolator

To define the criterion for each category, we can use Elasticsearch queries. When we store a document in Elasticsearch, we can check which of these queries match that document. Percolators are the way to achieve this.
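To make the idea concrete, here is a minimal in-memory sketch of query-to-document matching. It does not use the Elasticsearch API at all. It only models the concept: criteria are registered up front as stored queries, and each incoming document is checked against them. The class name PercolatorSketch, the register and percolate methods, and the category and hashtag values are all hypothetical, and the "match any of these hashtags" rule is an assumed stand-in for a real Elasticsearch query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Conceptual model of percolation only; this is NOT the Elasticsearch API.
public class PercolatorSketch {

    // Stored "queries", keyed by the category each one stands for.
    // LinkedHashMap keeps registration order, so results are deterministic.
    private final Map<String, Predicate<Set<String>>> queries = new LinkedHashMap<>();

    // Register a criterion (assumed rule): the category matches when the
    // document carries at least one of the given hashtags.
    public void register(String category, String... hashtags) {
        Set<String> wanted = new HashSet<>();
        for (String tag : hashtags) {
            wanted.add(tag.toLowerCase(Locale.ROOT));
        }
        queries.put(category, docTags -> docTags.stream().anyMatch(wanted::contains));
    }

    // Percolate: take one document's hashtags and return the categories
    // whose stored query matches it.
    public List<String> percolate(List<String> hashtags) {
        Set<String> docTags = new HashSet<>();
        for (String tag : hashtags) {
            docTags.add(tag.toLowerCase(Locale.ROOT));
        }
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Predicate<Set<String>>> entry : queries.entrySet()) {
            if (entry.getValue().test(docTags)) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        PercolatorSketch percolator = new PercolatorSketch();
        // Hypothetical categories and hashtags, for illustration only.
        percolator.register("sports", "football", "cricket");
        percolator.register("technology", "bigdata", "elasticsearch");

        // Prints [technology]
        System.out.println(percolator.percolate(Arrays.asList("Elasticsearch", "Hadoop")));
    }
}
```

Note that the direction is reversed compared to an ordinary search: the queries are the stored data, and the document is the input. A document may match several categories or none, so the sketch returns a list.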

Generally, when you search, you have a query that you execute against the search engine, and the search engine returns the matching documents to you. Percolators...