Elasticsearch for Hadoop

Cascading abstracts away the complexities of MapReduce by providing a data processing platform built around pipes and taps. This section will be of interest if you already use Cascading in your projects, or if you are familiar with Cascading and wish to integrate it with Elasticsearch. Basic knowledge of Cascading is therefore assumed throughout this section.

ES-Hadoop comes with a dedicated EsTap that can act as both a source tap and a sink tap, providing the plug points needed to integrate Elasticsearch with Cascading.

Importing data to Elasticsearch

Let's write a Cascading job to import data from HDFS to Elasticsearch.

Writing a Cascading job

Here is the code for the main() method of the Cascading job's driver class:

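import java.util.Properties;
import cascading.flow.FlowConnector;
import cascading.flow.hadoop.HadoopFlowConnector;

Properties props = new Properties();
props.setProperty("es.mapping.id", "id"); // use the "id" field as the document ID
FlowConnector flow = new HadoopFlowConnector(props);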

All the standard ES-Hadoop configurations that you learned earlier can be specified in the Properties object. The Properties object...
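As a rough illustration of collecting ES-Hadoop settings in a Properties object, the sketch below uses only the standard library. The class name EsJobProperties and its build() method are hypothetical helpers, not part of ES-Hadoop or the book's code, and the host and port values are assumptions for a local single-node cluster. The keys es.nodes, es.port, and es.mapping.id are standard ES-Hadoop configuration names.

```java
import java.util.Properties;

// Hypothetical helper (not from the book): gathers ES-Hadoop settings for a Cascading job.
class EsJobProperties {

    // Builds the Properties that would then be handed to new HadoopFlowConnector(props).
    static Properties build(String nodes, String port, String idField) {
        Properties props = new Properties();
        props.setProperty("es.nodes", nodes);        // Elasticsearch host(s) to connect to
        props.setProperty("es.port", port);          // REST port of the Elasticsearch node
        props.setProperty("es.mapping.id", idField); // field whose value becomes the document ID
        return props;
    }

    public static void main(String[] args) {
        // Assumed values for a local, single-node cluster.
        Properties props = build("localhost", "9200", "id");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Keeping the settings in one small helper like this is only one possible design; it makes the configuration easy to inspect before it is passed to the flow connector.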

By: Vishal Shukla
