ElasticSearch Cookbook

By: Alberto Paro
Overview of this book

ElasticSearch is one of the most promising NoSQL technologies available and is built to provide a scalable search solution with built-in support for near real-time search and multi-tenancy. This practical guide is a complete reference for using ElasticSearch and covers 360 degrees of the ElasticSearch ecosystem. We will get started by showing you how to choose the correct transport layer, communicate with the server, and create custom internal actions for boosting tailored needs. Starting with the basics of the ElasticSearch architecture and how to efficiently index, search, and execute analytics on it, you will learn how to extend ElasticSearch by scripting and monitoring its behaviour. Step-by-step, this book will help you to improve your ability to manage data in indexing with more tailored mappings, along with searching and executing analytics with facets. The topics explored in the book also cover how to integrate ElasticSearch with Python and Java applications. This comprehensive guide will allow you to master storing, searching, and analyzing data with ElasticSearch.

Using the Thrift protocol


Thrift is an interface definition language, initially developed by Facebook, used to define and create services. It is now maintained by the Apache Software Foundation.

Its usage is similar to HTTP, but it avoids some limits of the HTTP protocol (latency, handshake overhead, and so on), so it is generally faster.

Getting ready

You need a working ElasticSearch cluster with the thrift plugin installed (https://github.com/elasticsearch/elasticsearch-transport-thrift/). The standard port for the thrift protocol is 9500.

How to do it…

In Java, using the ElasticSearch generated classes, creating a client is quite easy, as shown in the following code snippet:

import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.apache.thrift.transport.TTransportException;
import org.elasticsearch.thrift.*;


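// Socket transport to the thrift port of an ElasticSearch node
TTransport transport = new TSocket("127.0.0.1", 9500);
// Wrap the transport in the binary protocol
TProtocol protocol = new TBinaryProtocol(transport);
// Create the client from the generated Rest service class
Rest.Client client = new Rest.Client(protocol);
// Open the connection; this can throw a TTransportException
transport.open();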

How it works…

To initialize a connection, we first need to create a socket transport. This is done with TSocket(host, port), using the ElasticSearch thrift standard port 9500.

Then the socket transport must be wrapped in a binary protocol; this is done with TBinaryProtocol(transport).

Now, a client can be initialized by passing it the protocol. The Rest.Client and other utility classes are generated from elasticsearch.thrift, and live in the org.elasticsearch.thrift namespace.

To have a fully working client, we must open the socket (transport.open()).

At the end of the program, we should clean up by closing the socket (transport.close()).
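
The open, use, and close sequence described above is easiest to get right with try/finally, so that the socket is released even when a call fails. The following is a minimal sketch of that pattern only. To keep it runnable without the Thrift jars, Transport is a hypothetical stand-in for Thrift's TTransport (it has the same open() and close() methods), and RecordingTransport and withOpenTransport are illustrative names that are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

final class TransportLifecycleSketch {

    // Hypothetical stand-in for Thrift's TTransport: only open() and close().
    interface Transport {
        void open();
        void close();
    }

    // Records the order of calls instead of touching the network.
    static final class RecordingTransport implements Transport {
        final List<String> events = new ArrayList<>();
        @Override public void open() { events.add("open"); }
        @Override public void close() { events.add("close"); }
    }

    // Opens the transport, runs the client calls, and always closes the transport.
    static void withOpenTransport(Transport transport, Runnable clientCalls) {
        transport.open();
        try {
            clientCalls.run();
        } finally {
            transport.close(); // runs even if clientCalls throws
        }
    }

    public static void main(String[] args) {
        RecordingTransport transport = new RecordingTransport();
        withOpenTransport(transport, () -> transport.events.add("request"));
        System.out.println(transport.events);
    }
}
```

With the classes from this recipe, the same shape would presumably apply: call transport.open(), use the client inside the try block, and call transport.close() in the finally block.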

There's more...

Some ElasticSearch drivers provide a simple-to-use API to interact with thrift without the boilerplate that this protocol needs.

For advanced usage, I suggest using the Thrift protocol to bypass some problems related to HTTP limits. They are as follows:

  • The number of simultaneous connections required in HTTP; the thrift transport is less resource hungry

  • The network traffic is lighter, due to its binary nature
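
To give a feel for what "binary" means here, the following minimal sketch encodes a string the way Thrift's binary protocol is generally understood to do it: a 4-byte big-endian length followed by the UTF-8 bytes. It uses only the Java standard library, not the Thrift library, and the class and method names are hypothetical. It illustrates one value on the wire and is not a measurement of real traffic.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BinaryStringSketch {

    // Encodes a string as a 4-byte big-endian length followed by its UTF-8 bytes.
    static byte[] encode(String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + utf8.length) // ByteBuffer is big-endian by default
                .putInt(utf8.length)
                .put(utf8)
                .array();
    }

    // Reads the length prefix, then exactly that many bytes of UTF-8 text.
    static String decode(byte[] wire) {
        ByteBuffer buffer = ByteBuffer.wrap(wire);
        byte[] utf8 = new byte[buffer.getInt()];
        buffer.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] wire = encode("/_cluster/health");
        System.out.println(wire.length + " bytes on the wire");
        System.out.println(decode(wire));
    }
}
```

The receiver knows exactly how many bytes to read from the length prefix, so no text parsing of delimiters or header lines is needed for this value.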

A big advantage of this protocol is that, on the server side, it wraps the REST entry points, so it can also be used with calls provided by external REST plugins.

See also