Learning Elasticsearch

By: Abhishek Andhavarapu

Overview of this book

Elasticsearch is a modern, fast, distributed, scalable, fault-tolerant, open source search and analytics engine. You can use Elasticsearch for small or large applications with billions of documents. It is built to scale horizontally and can handle both structured and unstructured data. Packed with easy-to-follow examples, this book gives you a firm understanding of the basics of Elasticsearch and shows you how to use its capabilities efficiently. You will install and set up Elasticsearch and Kibana, and handle documents using the Distributed Document Store. You will see how to query, search, and index your data, and perform aggregation-based analytics with ease. You will see how to use Kibana to explore and visualize your data. Further on, you will learn to handle document relationships, work with geospatial data, and much more. Finally, you will see how to set up and scale your Elasticsearch clusters in production environments.

Aggregation basics

Aggregation is one of the many reasons Elasticsearch stands apart from other systems; it is an analytics engine on steroids. Aggregation operations such as distinct, count, and average on large datasets are traditionally run on batch processing systems, such as Hadoop, because of the heavy computation involved, and running these kinds of queries on a large dataset using a traditional SQL database can be very challenging. Elasticsearch lets these queries run in real time, with sub-second response times. In my first project with Elasticsearch, we used it primarily for its aggregation capabilities and only a few of its search capabilities.

Aggregations in Elasticsearch are especially powerful because they can be nested inside one another. Let's take a query from the SQL world:

select avg(rating) from Product group by category; 

To execute the query, the products are first grouped by category...
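
As a rough illustration of the grouping step followed by the averaging step, here is a minimal Python sketch. It is not taken from the book. The `agg_request` dictionary shows how the SQL query above might be expressed as a nested aggregation (a `terms` aggregation containing an `avg` aggregation) in the body of an Elasticsearch search request. The field names `category` and `rating` come from the SQL example, while the labels `by_category` and `avg_rating`, the helper function, and the sample documents are hypothetical. The sketch also assumes `category` is mapped as a keyword-style field so that it can be used for bucketing.

```python
import json
from collections import defaultdict

# Search request body: a terms aggregation (the "group by category") with a
# nested avg aggregation (the "avg(rating)"). The names "by_category" and
# "avg_rating" are arbitrary labels chosen for this sketch.
agg_request = {
    "size": 0,  # return only aggregation results, not the matching documents
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},
            "aggs": {"avg_rating": {"avg": {"field": "rating"}}},
        }
    },
}


def average_rating_by_category(products):
    """Plain-Python emulation: bucket by category, then average each bucket."""
    buckets = defaultdict(list)
    for product in products:
        buckets[product["category"]].append(product["rating"])
    return {category: sum(ratings) / len(ratings)
            for category, ratings in buckets.items()}


# Hypothetical sample documents, for illustration only.
sample_products = [
    {"category": "books", "rating": 4.0},
    {"category": "books", "rating": 5.0},
    {"category": "toys", "rating": 3.0},
]

if __name__ == "__main__":
    print(json.dumps(agg_request, indent=2))
    print(average_rating_by_category(sample_products))
```

The emulation runs in a single process over an in-memory list, so it only mirrors the logical order of operations (buckets first, then a metric per bucket), not how Elasticsearch distributes the work across shards.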