Elasticsearch for Hadoop

For many use cases, Pig is one of the easiest ways to experiment with data in the Hadoop ecosystem. Its ease of use and simple syntax let you design dataflow pipelines without getting into complex programming. Assuming that you know Pig, we will cover how to move data to and from Elasticsearch. If you don't know Pig yet, you can still follow the steps; by the end of the section, you will at least know how to use Pig to ingest data into Elasticsearch and read it back.

Setting up Apache Pig for Elasticsearch

Let's start by setting up Apache Pig. At the time of writing this book, the latest Pig version available is 0.15.0. You can perform the following steps to set up the same version:

Download the Pig distribution using the following command:

$ sudo wget -O /usr/local/pig.tar.gz http://mirrors.sonic.net/apache/pig/pig-0.15.0/pig-0.15.0.tar.gz

Extract Pig to the desired location and give it a convenient...
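The download step above repeats the version number twice in the URL, so it can help to build the URL from a single variable. The following is only a minimal sketch: the variable names (PIG_VERSION, PIG_MIRROR, PIG_TARBALL, PIG_URL) are hypothetical, and only the version, mirror address, and target path are taken from the command shown earlier. The script prints the command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical variable names; the values come from the download command in the text.
PIG_VERSION="0.15.0"
PIG_MIRROR="http://mirrors.sonic.net/apache/pig"
PIG_TARBALL="/usr/local/pig.tar.gz"

# Build the URL from the version so both occurrences stay in sync.
PIG_URL="${PIG_MIRROR}/pig-${PIG_VERSION}/pig-${PIG_VERSION}.tar.gz"

# Print the command for review; remove the leading "echo" to actually download.
echo sudo wget -O "$PIG_TARBALL" "$PIG_URL"
```

If you want a different Pig release, change only PIG_VERSION, assuming the mirror keeps the same directory layout for that release.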

By : Vishal Shukla
