Elasticsearch for Hadoop

By: Vishal Shukla

Cascading with Elasticsearch


Cascading abstracts away the complexities of MapReduce by providing a platform for data processing in terms of pipes and taps. This section will interest you if you already use Cascading in your projects, or if you are familiar with Cascading and wish to integrate it with Elasticsearch. Basic knowledge of Cascading is therefore assumed for this section.

ES-Hadoop comes with a dedicated EsTap that can act as both a source tap and a sink tap, which gives Cascading the plug points it needs to read from and write to Elasticsearch.

Importing data to Elasticsearch

Let's write a Cascading job to import data from HDFS to Elasticsearch.

Writing a Cascading job

Here is the code for the main() method of the Cascading job's driver class:

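// Job-level settings; es.mapping.id names the field used as the document ID
Properties props = new Properties();
props.setProperty("es.mapping.id", "id");
// The connector receives these properties when it is created
FlowConnector flow = new HadoopFlowConnector(props);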
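As a rough illustration of how such a Properties object might be assembled before it is handed to the connector, here is a minimal, standalone sketch. The EsJobProperties class, its build() method, and the localhost:9200 address are hypothetical names and values chosen for this example; only the es.nodes and es.mapping.id keys are ES-Hadoop settings. The Cascading call itself is left as a comment so that the sketch runs without the Cascading and ES-Hadoop libraries on the classpath.

```java
import java.util.Properties;

// Hypothetical helper (not part of ES-Hadoop or Cascading): gathers the
// ES-Hadoop settings for a Cascading job into a single Properties object.
class EsJobProperties {

    // nodes and idField are illustrative parameters supplied by the caller.
    static Properties build(String nodes, String idField) {
        Properties props = new Properties();
        // Elasticsearch node(s) the job should connect to
        props.setProperty("es.nodes", nodes);
        // Field whose value is used as the Elasticsearch document ID
        props.setProperty("es.mapping.id", idField);
        return props;
    }

    public static void main(String[] args) {
        // Assumed local, single-node setup
        Properties props = build("localhost:9200", "id");
        // In the driver class, this object would be passed to the connector:
        // FlowConnector flow = new HadoopFlowConnector(props);
        props.stringPropertyNames().stream().sorted()
             .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

Keeping the settings in one helper like this is only a convenience; setting the same keys directly in main(), as the driver code does, has the same effect.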

All the standard ES-Hadoop configurations that you learned earlier can be specified in the Properties object. The Properties object...