Elasticsearch for Hadoop

By: Vishal Shukla

Pigging out Elasticsearch


For many use cases, Pig is one of the easiest ways to work with data in the Hadoop ecosystem. Its ease of use and simple syntax let you design dataflow pipelines without getting into complex programming. Assuming that you know Pig, we will cover how to move data to and from Elasticsearch. If you don't know Pig yet, don't worry. You can still follow the steps, and by the end of the section you will at least know how to use Pig to ingest data into Elasticsearch and read it back.

Setting up Apache Pig for Elasticsearch

Let's start by setting up Apache Pig. At the time of writing, the latest available Pig version is 0.15.0. Perform the following steps to set up this version:

  1. Download the Pig distribution using the following command:

    $ sudo wget -O /usr/local/pig.tar.gz http://mirrors.sonic.net/apache/pig/pig-0.15.0/pig-0.15.0.tar.gz
    
  2. Extract Pig to the desired location and give it a convenient...