Elasticsearch 5.x Cookbook - Third Edition

By: Alberto Paro

Overview of this book

Elasticsearch is a Lucene-based distributed search server that allows users to index and search unstructured content at petabyte scale. This book is your one-stop guide to mastering the complete Elasticsearch ecosystem. We’ll guide you through comprehensive recipes on what’s new in Elasticsearch 5.x, showing you how to create complex queries and analytics, and perform index mapping, aggregation, and scripting. Further on, you will explore the cluster and node monitoring modules and see ways to back up and restore a snapshot of an index. You will understand how to install Kibana to monitor a cluster and how to extend Kibana with plugins. Finally, you will see how to integrate your Java, Scala, and Python applications, as well as Big Data tools such as Apache Spark and Pig, with Elasticsearch, and how to add enhanced functionality with custom plugins. By the end of this book, you will have in-depth knowledge of how the Elasticsearch architecture is implemented and will be able to manage data efficiently and effectively with Elasticsearch.

Communicating with Elasticsearch


In Elasticsearch 5.x, there are only two ways to communicate with the server: the HTTP protocol and the native one. In this recipe, we will take a look at both protocols.

Getting ready

The standard installation of Elasticsearch provides access via its web services on port 9200 for HTTP and on port 9300 for the native Elasticsearch protocol. Once an Elasticsearch server is started, you can communicate with it on these ports.
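As a quick way to confirm that a node is listening on both ports, the following minimal Python sketch tries to open a TCP connection to each one. It uses only the standard library. The helper names (`is_port_open`, `check_default_ports`) and the `localhost` host are assumptions for illustration, and a successful connection only shows that something is listening on the port, not that it is Elasticsearch.

```python
import socket

# Default ports of a standard installation, as stated in this recipe.
DEFAULT_PORTS = {"http": 9200, "native": 9300}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_default_ports(host="localhost"):
    """Map each protocol name to whether its default port accepts connections."""
    return {name: is_port_open(host, port) for name, port in DEFAULT_PORTS.items()}
```

Calling `check_default_ports()` on the machine that runs the node returns a dictionary with one boolean per protocol.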

How it works...

Elasticsearch is designed to be used as a RESTful server, so the main protocol is HTTP, usually on port 9200 or above. It is the only protocol that can be used by programming languages that don't run on a Java Virtual Machine (JVM).
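To illustrate HTTP access from a non-JVM language, here is a minimal Python sketch that builds and sends a JSON request using only the standard library. The helper names (`build_request`, `call_elasticsearch`), the `localhost` host, and the example query body are assumptions for illustration, not part of the recipe.

```python
import json
import urllib.request


def build_request(host="localhost", port=9200, path="/", body=None):
    """Build an HTTP request for an Elasticsearch node.

    Without a body the request is a GET; with a body it is a POST
    carrying the body serialized as JSON.
    """
    url = "http://%s:%d%s" % (host, port, path)
    if body is None:
        return urllib.request.Request(url, method="GET")
    data = json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def call_elasticsearch(request, timeout=5.0):
    """Send the request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage against a running local node (not executed here):
#   info = call_elasticsearch(build_request())
#   hits = call_elasticsearch(
#       build_request(path="/_search", body={"query": {"match_all": {}}}))
```

Keeping request construction separate from sending makes it easy to inspect the URL, method, and payload without a live server.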

Every protocol has advantages and disadvantages. It's important to choose the correct one depending on the kind of application you are developing. If you are in doubt, choose the HTTP protocol layer, which is the most standard and the easiest to use.

Choosing the right protocol depends on several factors, mainly architectural and performance related. The following table summarizes the advantages and disadvantages of each protocol:

| Protocol | Advantages | Disadvantages | Type |
| --- | --- | --- | --- |
| HTTP | It is the most frequently used. It is API safe and generally compatible across different ES versions. It is the suggested protocol and uses JSON. It is easy to proxy and to balance with HTTP balancers. | It carries HTTP overhead. HTTP clients don't know the cluster topology, so they require more hops to access data. | Text |
| Native | It is a fast network layer. It is programmatic. It is best for massive index operations. It is more compact due to its binary nature. It is faster because the clients know the cluster topology. The native serializer/deserializer is more efficient than the JSON one. | The API changes and breaks applications. It requires the same version as the ES server. It works only on the JVM. | Binary |
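
Because HTTP clients don't know the cluster topology, an application that talks to several nodes has to decide for itself where each request goes, unless an HTTP balancer sits in front of the cluster. The sketch below shows one simple client-side option, rotating over a fixed list of node URLs. This is an illustrative assumption rather than a technique from the recipe, and the class name `RoundRobinNodes` and the hostnames `es1` and `es2` are hypothetical.

```python
import itertools


class RoundRobinNodes:
    """Rotate over a fixed list of node base URLs (hypothetical helper)."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node URL is required")
        # itertools.cycle repeats the list endlessly in order.
        self._cycle = itertools.cycle(nodes)

    def next_url(self, path="/"):
        """Return the full URL for path on the next node in the rotation."""
        return next(self._cycle) + path


# Hypothetical node addresses; replace them with the real ones.
nodes = RoundRobinNodes(["http://es1:9200", "http://es2:9200"])
first = nodes.next_url("/_search")   # served by es1
second = nodes.next_url("/_search")  # served by es2
```

This only spreads requests evenly; it does not detect failed nodes or discover new ones, which is part of what the native client's knowledge of the topology provides.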