1. Setting Up Environment | Elasticsearch for Hadoop

Elasticsearch for Hadoop

By: Shukla

Overview of this book

The Hadoop ecosystem is a de facto standard for processing terabytes and petabytes of data. Elasticsearch, built on Lucene, is becoming an industry standard for its full-text search and aggregation capabilities. Elasticsearch-Hadoop bridges Elasticsearch and the Hadoop ecosystem so that you can get the best of both worlds. Combined with Kibana, this stack makes it easy to draw insights quickly from the massive amounts of data in your Hadoop ecosystem. In this book, you'll learn to use Elasticsearch, Kibana, and Elasticsearch-Hadoop effectively to analyze and understand your HDFS and streaming data. You begin with an in-depth look at setting up Hadoop, Elasticsearch, Marvel, and Kibana. Next, you will learn to import Hadoop data into Elasticsearch by writing a MapReduce job in a real-world example. This is followed by a comprehensive look at Elasticsearch essentials, such as full-text search analysis, queries, filters, and aggregations, after which you will learn to create various visualizations and interactive dashboards using Kibana. Classifying real-world streaming data and identifying trends in it using Storm and Elasticsearch are among the other topics covered. You will also gain insight into the key concepts of Elasticsearch and Elasticsearch-Hadoop in distributed mode, along with advanced configurations and some common configuration presets you may need for production deployments. You will get a "Go production checklist" and a high-level view of cluster administration after going to production. Towards the end, you will learn to integrate Elasticsearch with other Hadoop ecosystem tools, such as Pig, Hive, and Spark.

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

1. Setting Up Environment

Setting up Hadoop for Elasticsearch

Setting up Elasticsearch

Running the WordCount example

Exploring data in Head and Marvel

Summary

2. Getting Started with ES-Hadoop

Understanding the WordCount program

Going real — network monitoring data

Writing the NetworkLogsMapper job

Getting data from Elasticsearch to HDFS

Summary

3. Understanding Elasticsearch

Knowing Search and Elasticsearch

Talking to Elasticsearch

Controlling the indexing process

Elastic searching

Aggregations

Summary

4. Visualizing Big Data Using Kibana

Setting up and getting started

Discovering data

Summary

5. Real-Time Analytics

Getting started with the Twitter Trend Analyser

Injecting streaming data into Storm

Analyzing trends

Classifying tweets using percolators

Summary

6. ES-Hadoop in Production

Elasticsearch in a distributed environment

The ES-Hadoop architecture

Configuring the environment for production

Administration of clusters

Summary

7. Integrating with the Hadoop Ecosystem

Pigging out Elasticsearch

SQLizing Elasticsearch with Hive

Cascading with Elasticsearch

Giving Spark to Elasticsearch

ES-Hadoop on YARN

Summary

A. Configurations

Basic configurations

Write and query configurations

Mapping configurations

Index configurations

Network configurations

Authentication configurations

SSL configurations

Proxy configurations

Index

Chapter 1. Setting Up Environment
