Elasticsearch 5.x Cookbook - Third Edition

By: Alberto Paro

Overview of this book

Elasticsearch is a Lucene-based distributed search server that allows users to index and search petabytes of unstructured content. This book is your one-stop guide to mastering the complete Elasticsearch ecosystem. We’ll guide you through comprehensive recipes on what’s new in Elasticsearch 5.x, showing you how to create complex queries and analytics, and perform index mapping, aggregation, and scripting. Further on, you will explore the modules for cluster and node monitoring and see ways to back up and restore a snapshot of an index. You will understand how to install Kibana to monitor a cluster and how to extend Kibana with plugins. Finally, you will see how to integrate your Java, Scala, and Python applications, and Big Data tools such as Apache Spark and Pig, with Elasticsearch, and how to add enhanced functionality with custom plugins. By the end of this book, you will have in-depth knowledge of how the Elasticsearch architecture is implemented and will be able to manage data efficiently and effectively with Elasticsearch.

Using the HTTP protocol


This recipe shows several examples of calling Elasticsearch over the HTTP protocol.

Getting ready

You need a working Elasticsearch cluster. With the default configuration, Elasticsearch listens for HTTP requests on port 9200 of your server.

How to do it...

The standard RESTful protocol is easy to integrate because it is the lingua franca for the Web and can be used by every programming language.

Now, I'll show how easy it is to call the Elasticsearch greeting API on a server running on port 9200, using several programming languages.

In Bash or the Windows command prompt, the request will be:

 curl -XGET http://127.0.0.1:9200

In Python, the request will be:

  # Python 3: urlopen lives in the urllib.request module
  import urllib.request
  result = urllib.request.urlopen("http://127.0.0.1:9200")

In Java, the request will be:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

...
try {
  // get URL content
  URL url = new URL("http://127.0.0.1:9200");
  URLConnection conn = url.openConnection();
  // open the stream and put it into BufferedReader
  BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream()));

  String inputLine;
  while ((inputLine = br.readLine()) != null) {
    System.out.println(inputLine);
  }
  br.close();
  System.out.println("Done");
} catch (MalformedURLException e) {
  e.printStackTrace();
} catch (IOException e) {
  e.printStackTrace();
}

In Scala, the request will be:

scala.io.Source.fromURL("http://127.0.0.1:9200", "utf-8").getLines.mkString("\n") 

For every language sample, the response will be the same:

{
  "name" : "elasticsearch",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "rbCPXgcwSM6CjnX8u3oRMA",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}

How it works...

Every client opens a connection to the server's root endpoint (/) and fetches the answer. The answer is a JSON object.
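Because the answer is plain JSON, a client can decode it with any standard JSON library. The following is a minimal Python 3 sketch, not part of the original recipe: the helper names `parse_greeting` and `fetch_greeting` are hypothetical, and the field names are taken from the sample response shown above.

```python
import json
import urllib.request


def parse_greeting(body):
    """Decode the JSON greeting and pull out a few fields of interest."""
    info = json.loads(body)
    return {
        "cluster_name": info["cluster_name"],
        "version": info["version"]["number"],
        "lucene_version": info["version"]["lucene_version"],
    }


def fetch_greeting(url="http://127.0.0.1:9200", timeout=5):
    """GET the root endpoint and return the parsed summary.

    Assumes a node is reachable at the given URL (hypothetical default).
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_greeting(response.read().decode("utf-8"))
```

With a node running locally, `fetch_greeting()` should return a small dictionary holding the cluster name and the Elasticsearch and Lucene version strings.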

You can call an Elasticsearch server from any programming language you like. The main advantages of this protocol are:

  • Portability: It uses web standards, so it can be integrated with different languages (Erlang, JavaScript, Python, or Ruby) or called from command-line applications such as curl

  • Durability: The REST APIs rarely change. Unlike the native protocol, they do not break between minor releases

  • Simple to use: Requests and responses are both JSON

  • Better supported than other protocols: Every plugin typically exposes a REST endpoint over HTTP

  • Easy cluster scaling: Simply put your cluster nodes behind an HTTP load balancer, such as HAProxy or NGINX, to balance the calls
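The "JSON in, JSON out" point from the list above can be sketched in Python 3 as follows. This is an illustration under stated assumptions rather than code from the recipe: the helper names `build_search_request` and `run_search` are hypothetical, the example targets the `_search` endpoint with a `match_all` query, and `base_url` may be either a single node or the address of a load balancer placed in front of the cluster.

```python
import json
import urllib.request


def build_search_request(base_url, query):
    """Build (but do not send) a POST request carrying a JSON query body."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(request, timeout=5):
    """Send the request and decode the JSON reply (needs a reachable node)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming a node (or load balancer) at this hypothetical address:
# request = build_search_request("http://127.0.0.1:9200", {"match_all": {}})
# hits = run_search(request)
```

Separating request construction from sending keeps the JSON payload easy to inspect and lets the same request builder be pointed at a different host without other changes.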

In this book, many examples call the HTTP API with the command-line curl program. This approach is quick and lets you test functionality with very little setup.

There's more...

Most programming languages offer libraries for working with Elasticsearch or with RESTful web services in general. The Elasticsearch community provides official drivers for the most widely used programming languages.