Learning Elastic Stack 6.0

By: Pranav Shukla, Sharath Kumar M N

Overview of this book

The Elastic Stack is a powerful combination of tools for distributed search, analytics, logging, and visualization of data from medium to massive data sets. The newly released Elastic Stack 6.0 brings new features and capabilities that empower users to find unique, actionable insights through these techniques. This book will give you a fundamental understanding of what the stack is all about, and how to use it efficiently to build powerful real-time data processing applications. After a quick overview of the newly introduced features in Elastic Stack 6.0, you’ll learn how to set up the stack by installing the tools, and see their basic configurations. You’ll then see how to use Elasticsearch for distributed searching and analytics, along with Logstash for logging, and Kibana for data visualization. The book also demonstrates the creation of custom plugins using Kibana and Beats. You’ll find out about Elastic X-Pack, a useful extension for effective security and monitoring. The book also provides useful tips on how to use the Elastic Cloud and deploy the Elastic Stack in production environments. On completing this book, you’ll have a solid foundational knowledge of the basic Elastic Stack functionalities. You’ll also have a good understanding of the role each component in the stack plays in solving different data processing problems.

Use cases of Elastic Stack


Elastic Stack components have a variety of practical use cases, and new use cases are emerging as more plugins are added to existing components. As mentioned earlier, you may use a subset of the components for your use case. The following example use cases are by no means exhaustive, but are some of the most common ones:

  • Log and security analytics
  • Product search
  • Metrics analytics
  • Web search and website search

Let us look at each use case.

Log and security analytics

The trio of Elasticsearch, Logstash, and Kibana was previously popular as the ELK stack. These three components make Elastic Stack an excellent platform for aggregating and analyzing logs in a central place.

Application support teams face a great challenge in administering and managing large numbers of applications deployed across tens or hundreds of servers. The application infrastructure could have the following components:

  • Web servers
  • Application servers
  • Database servers
  • Message brokers

Typically, enterprise applications have all or most of the server types listed above, with multiple instances of each. In the event of an error or production issue, the support team has to log in to individual servers and inspect the raw log files, which is quite inefficient. Elastic Stack provides a complete tool set to collect, centralize, analyze, visualize, alert on, and report errors as they occur. Here is how each component can be used to solve this problem:

  • The Beats framework, Filebeat in particular, can run as a lightweight agent to collect and forward the logs.
  • Logstash can centralize the events received from Beats, and parse and transform each log entry before sending it to the Elasticsearch cluster.
  • Elasticsearch indexes the logs. It enables both search and analytics on the parsed logs.
  • Kibana then lets you create visualizations based on errors, warnings, and other information logs. It lets you create dashboards where you can centrally monitor events as they occur, in real time.
  • With X-Pack, you can secure the solution, configure alerts, get reports, and analyze relationships in the data.

As you can see, you can get a complete log aggregation and monitoring solution using Elastic Stack.
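
The parse-then-index steps in the list above can be sketched in a few lines of Python. This is a minimal illustration and not how Logstash itself is implemented: the log layout (timestamp, level, message), the host name, and the helper names parse_line and to_bulk_body are all assumptions made for the example. The output follows the newline-delimited JSON shape used by the Elasticsearch bulk API, with the target index assumed to be named in the request URL.

```python
import json
import re

# Hypothetical log layout: "<ISO timestamp> <LEVEL> <message>".
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def parse_line(line, host):
    """Turn one raw log line into a structured event, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    event["host"] = host  # record which server produced the line
    return event


def to_bulk_body(events):
    """Build a newline-delimited JSON body in the shape of a bulk request.

    Each event becomes two lines: an action line and the document itself.
    The empty action assumes the target index is given in the request URL.
    """
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(event, sort_keys=True))
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


raw_lines = [
    "2017-12-01T10:15:30Z ERROR Payment service timed out",
    "2017-12-01T10:15:31Z INFO Retry scheduled",
    "not a structured line",
]
parsed = (parse_line(line, "app-server-01") for line in raw_lines)
events = [event for event in parsed if event is not None]
bulk_body = to_bulk_body(events)
print(bulk_body)
```

In a real deployment, Filebeat and Logstash would do this collection and parsing through configuration, so code like this is only needed when you want to see or customize what happens to each log entry.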

A security analytics solution would be very similar, except that the logs and events fed into the system would pertain to firewalls, switches, and other key network elements.

Product search

Product search involves finding the most relevant products among thousands or tens of thousands of products and presenting them at the top of the list, ahead of less relevant ones. This problem is directly familiar from e-commerce websites, which list huge numbers of products sold by many vendors or resellers.

Elasticsearch's full-text and relevance search capabilities can find the best matching results. Presenting the best matches on the first page has great value as it increases the chances of the customer actually buying the product. Imagine a customer searching for the iPhone 7, and the first page of results showing cases, chargers, and accessories for previous iPhone versions instead. The text analysis capabilities backed by Lucene, and innovations added by Elasticsearch, ensure that the best match comes first and that iPhone 7 chargers and cases appear after it.

This problem, however, is not limited to e-commerce websites. Any application that needs to find the most relevant item from millions or billions of items can use Elasticsearch to solve this problem.
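
As a rough sketch of how a relevance-oriented product search might be expressed, the following Python builds a search request body using the multi_match query from the Elasticsearch query DSL. The field names name and description, the boost factor of 3, and the helper build_product_query are assumptions for illustration; real relevance tuning depends on your own mappings and data.

```python
import json


def build_product_query(text, size=10):
    """Return a search request body that favors matches in the product name.

    The field names (name, description) are hypothetical.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # "name^3" weights matches in the name field three times
                # as heavily as matches in the description field.
                "fields": ["name^3", "description"],
            }
        },
    }


body = build_product_query("iPhone 7")
print(json.dumps(body, indent=2))
```

Boosting the name field is one simple way to push the product itself above accessories that merely mention it in their descriptions.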

Metrics analytics

Elastic Stack has excellent analytics capabilities thanks to the rich aggregations API in Elasticsearch. This makes it well suited to analyzing data with lots of metrics. Metric data consists of numeric values, as opposed to unstructured text such as documents and web pages. Examples include data generated by sensors and IoT devices, and metrics generated by mobile devices, servers, virtual machines, network routers, switches, and so on. The list is endless.

Metric data is typically also time series data; that is, values or measures are recorded over a period of time. The metrics that are recorded are usually related to some entity. For example, a temperature reading (which is a metric) is recorded for a particular sensor device with a certain identifier. The sensor type, building name, department, floor, and so on are the dimensions associated with the metric. The dimensions may also include the location of the sensor device, that is, its longitude and latitude.
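
To make the split between a metric and its dimensions concrete, here is a small Python sketch of one sensor reading as a document ready for indexing. The class SensorReading, its field names, and the sample values are all hypothetical. The lat/lon object is one of the forms Elasticsearch accepts for a field mapped as geo_point, which this sketch assumes for location.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SensorReading:
    # The metric: a numeric value recorded at a point in time
    timestamp: str
    temperature: float
    # Dimensions: attributes of the entity that produced the metric
    sensor_id: str
    sensor_type: str
    building: str
    department: str
    floor: int
    latitude: float
    longitude: float

    def to_document(self):
        """Flatten into a JSON-ready dict, grouping coordinates as lat/lon."""
        doc = asdict(self)
        doc["location"] = {
            "lat": doc.pop("latitude"),
            "lon": doc.pop("longitude"),
        }
        return doc


reading = SensorReading(
    timestamp="2017-12-01T10:15:30Z",
    temperature=21.5,
    sensor_id="sensor-42",
    sensor_type="Temperature",
    building="HQ",
    department="Engineering",
    floor=3,
    latitude=12.97,
    longitude=77.59,
)
print(json.dumps(reading.to_document(), indent=2))
```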

Elasticsearch and Kibana allow for the slicing and dicing of metric data along different dimensions to provide deep insight about your data. Elasticsearch is very powerful at handling time-series and geo-spatial data, which means you can plot your metrics on line charts and area charts aggregating millions of metrics. You can also do geo-spatial analysis on a map.
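
Slicing along a dimension and then over time can be sketched as a nested aggregation request. The following Python builds such a request body: a terms aggregation per building, a date_histogram inside it, and an avg metric in each time bucket. The field names (building, timestamp, temperature) and the helper name are assumptions, and building is assumed to be mapped as a keyword field. The interval parameter is the form used by Elasticsearch 6.x; later versions use different parameter names for date histogram intervals.

```python
import json


def build_temperature_aggregation(interval="1h"):
    """Average temperature per building, bucketed over time."""
    return {
        "size": 0,  # return only aggregation results, no individual documents
        "aggs": {
            "by_building": {
                # one bucket per distinct building (a dimension)
                "terms": {"field": "building"},
                "aggs": {
                    "over_time": {
                        # split each building's readings into time buckets
                        "date_histogram": {
                            "field": "timestamp",
                            "interval": interval,
                        },
                        "aggs": {
                            # the metric computed inside each time bucket
                            "avg_temperature": {
                                "avg": {"field": "temperature"}
                            }
                        },
                    }
                },
            }
        },
    }


agg_body = build_temperature_aggregation()
print(json.dumps(agg_body, indent=2))
```

Each over_time bucket in the response would correspond to one point on a line chart, with one line per building.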

We will build a metrics analytics application using Elastic Stack in Chapter 9, Building a Sensor Data Analytics Application.

Web search and website search

Elasticsearch can serve as a search engine for your website and perform a Google-like search across the entire contents of your site. GitHub, Wikipedia, and many other platforms power their searches using Elasticsearch.

Elasticsearch can also be leveraged to build content aggregation platforms. What is a content aggregator or a content aggregation platform? Content aggregators scrape or crawl multiple websites, index the web pages, and provide search functionality over the underlying content. This is a powerful way to build domain-specific aggregation platforms.

Apache Nutch, an open source, large-scale web crawler, was created by Doug Cutting, the original creator of Apache Lucene. Apache Nutch crawls the web, parses the HTML pages, stores them, and also builds indexes to make the content searchable. Apache Nutch supports indexing into Elasticsearch or Apache Solr for its search engine.
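
The parse step of a content aggregator can be illustrated on a single in-memory page. The following Python sketch uses only the standard library to pull the title and visible text out of an HTML string and shape them into a document that could then be indexed. It is not how Apache Nutch works internally; the class PageExtractor, the helper page_to_document, the document fields, and the sample URL are all assumptions for the example, and no crawling or network access is involved.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title> and visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif self._skip_depth == 0:
            self.text_parts.append(text)


def page_to_document(url, html):
    """Parse one page into a dict with hypothetical url/title/content fields."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": parser.title,
        "content": " ".join(parser.text_parts),
    }


sample_html = (
    "<html><head><title>Elastic Stack Use Cases</title>"
    "<style>body { margin: 0; }</style></head>"
    "<body><h1>Log analytics</h1><p>Centralize your logs.</p>"
    "<script>var x = 1;</script></body></html>"
)
document = page_to_document("https://example.com/use-cases", sample_html)
print(document)
```

A production crawler also has to handle link discovery, politeness rules, and malformed markup, which is why a dedicated tool is usually preferable to hand-written parsing.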

As is evident, Elasticsearch and Elastic Stack have many practical use cases. Elastic Stack is a platform with a complete set of tools to build end-to-end search and analytics solutions. It is a very approachable platform for developers, architects, business intelligence analysts, and system administrators. It is possible to put together an Elastic Stack solution with almost no coding, using configuration alone. At the same time, Elasticsearch is very customizable; that is, developers and programmers can build powerful applications using its rich programming language support and the REST API.