Now that we know how to write a MapReduce job that processes data and pushes it from HDFS to Elasticsearch, let's try a real-world example to get a feel for the value of doing this the ES-Hadoop way.
For illustration purposes, let's consider an example dataset of log files from a hypothetical network security and monitoring tool. This tool acts as a combined gateway and firewall between the devices on the network and the Internet. The firewall detects viruses and spyware, categorizes outgoing traffic, and blocks or allows each request based on the configured policies.
You can download the sample data generated by the tool from https://github.com/vishalbrevitaz/eshadoop/tree/master/ch02. Here is a snippet to give you a quick look at the format:
Jan 01 12:26:26 src="10.1.1.89:0" dst="74.125.130.95" id="None" act="ALLOW" msg="fonts.googleapis.com/css?family...
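Each log entry is a timestamp followed by a series of key="value" fields, which makes it straightforward to parse before indexing into Elasticsearch. As a rough sketch (the class name and regex here are illustrative, not part of the dataset's tooling), the fields of one line could be extracted like this; note that the leading timestamp is not a key="value" pair and would need separate handling:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses the key="value" fields of one log line.
public class LogLineParser {

    // Matches fields such as src="10.1.1.89:0" or act="ALLOW".
    private static final Pattern FIELD = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    public static Map<String, String> parse(String line) {
        // LinkedHashMap preserves the order the fields appear in the line.
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Jan 01 12:26:26 src=\"10.1.1.89:0\" dst=\"74.125.130.95\" "
                + "id=\"None\" act=\"ALLOW\" msg=\"fonts.googleapis.com/css\"";
        Map<String, String> fields = parse(line);
        System.out.println(fields.get("src")); // 10.1.1.89:0
        System.out.println(fields.get("act")); // ALLOW
    }
}
```

In a MapReduce job, logic like this would typically live in the Mapper, turning each raw line into a map of fields that ES-Hadoop can serialize as a JSON document.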