Book Image

Advanced Elasticsearch 7.0

By : Wai Tak Wong
Book Image

Advanced Elasticsearch 7.0

By: Wai Tak Wong

Overview of this book

Building enterprise-grade distributed applications and executing systematic search operations call for a strong understanding of Elasticsearch and expertise in using its core APIs and latest features. This book will help you master the advanced functionalities of Elasticsearch and understand how you can develop a sophisticated, real-time search engine confidently. In addition to this, you'll also learn to run machine learning jobs in Elasticsearch to speed up routine tasks. You'll get started by learning to use Elasticsearch features on Hadoop and Spark and make search results faster, thereby improving the speed of query results and enhancing the customer experience. You'll then get up to speed with performing analytics by building a metrics pipeline, defining queries, and using Kibana for intuitive visualizations that help provide decision-makers with better insights. The book will later guide you through using Logstash with examples to collect, parse, and enrich logs before indexing them in Elasticsearch. By the end of this book, you will have comprehensive knowledge of advanced topics such as Apache Spark support, machine learning using Elasticsearch and scikit-learn, and real-time analytics, along with the expertise you need to increase business productivity, perform analytics, and get the very best out of Elasticsearch.
Table of Contents (25 chapters)
Free Chapter
1
Section 1: Fundamentals and Core APIs
8
Section 2: Data Modeling, Aggregations Framework, Pipeline, and Data Analytics
13
Section 3: Programming with the Elasticsearch Client
16
Section 4: Elastic Stack
20
Section 5: Advanced Features

Talking to Elasticsearch

Many programming languages (including Java, Python, and .NET) have official clients written and supported by Elasticsearch (https://www.elastic.co/guide/en/elasticsearch/client/index.html). However, by default, only two protocols are really supported, HTTP (via a RESTful API) and native. You can talk to Elasticsearch via one of the following ways:

  • Transport client: One of the native ways to connect to Elasticsearch.
  • Node client: Similar to the transport client. In most cases, if you're using Java, you should choose the transport client instead of the node client.
  • HTTP client: For most programming languages, HTTP is the most common way to connect to Elasticsearch.
  • Other protocols: It's possible to create a new client interface to Elasticsearch simply by writing a plugin.
Transport clients (that is, the Java API) are scheduled to be deprecated in Elasticsearch 7.0 and completely removed in 8.0. Java users should use a Java High Level REST Client.

You can communicate with Elasticsearch via the default 9200 port using the RESTful API. An example of using the curl command to communicate with Elasticsearch from the command line is shown in the following code block. You should see the instance details and the cluster information in the response. Before running the following command, make sure the installed Elasticsearch server is running. In the response, the machine's hostname is wai. The default Elasticsearch cluster name is elasticsearch. The version of Elasticsearch that is running is 7.0.0. The downloaded Elasticsearch software is in TAR format. The version of Lucene used is 8.0.0:

curl -XGET 'http://localhost:9200'
{
"name" : "wai",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "7-fjLIFkQrednHgFh0Ufxw",
"version" : {
"number" : "7.0.0",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "a30e8c2",
"build_date" : "2018-12-17T12:33:32.311168Z",
"build_snapshot" : false,
"lucene_version" : "8.0.0",
"minimum_wire_compatibility_version" : "6.6.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}

Using Postman to work with the Elasticsearch REST API

The Postman app is a handy tool for testing the REST API. In this book, we'll use Postman to illustrate the examples. The following are step-by-step instructions for installing Postman from the official download site (https://www.getpostman.com/apps):

  1. Select Package Management (Windows, macOS, or Linux) and download the appropriate 32-/64-bit version for your operating system. For 64-bit Linux package management, the filename is Postman-linux-x64-6.6.1.tar.gz.
  2. Extract the GNU zipped file into your target directory, which will generate a folder called Postman:
tar -zxvf Postman-linux-x64-6.6.1.tar.gz
  1. Go to the folder and run Postman and you'll see a pop-up window:
cd Postman
./Postman
  1. In the pop-up window, use the same URL as in the previous curl command and press the Send button. You will get the same output shown as follows:

In the next section, let's dive into the architectural overview of Elasticsearch.