Learning Kibana 7 - Second Edition

By: Anurag Srivastava, Bahaaldine Azarmi

Overview of this book

Kibana is a window into the Elastic Stack that enables the visual exploration and real-time analysis of your data in Elasticsearch. This book will help you understand how you can use Kibana 7 for rich analytics and data visualization.

If you're new to the tool or want to get to grips with the latest features introduced in Kibana 7, this book is the perfect beginner's guide. You'll learn how to set up and configure the Elastic Stack and understand where Kibana sits within the architecture. As you advance, you'll learn how to ingest data from different sources using Beats or Logstash into Elasticsearch, followed by exploring and visualizing data in Kibana. Whether working with time-series data to create complex graphs using Timelion or embedding visualizations created in Kibana into your web applications, this book covers it all. It also covers topics that every Elastic developer needs to be aware of, such as installing and configuring application performance monitoring (APM) servers and agents. Finally, you'll also learn how to create effective machine learning jobs in Kibana to find anomalies in your data.

By the end of this book, you'll have a solid understanding of Kibana, and be able to create your own visual analytics solutions from scratch.
Table of Contents (16 chapters)

Section 1: Understanding Kibana 7
Section 2: Exploring the Data
Section 3: Tools for Playing with Your Data
Section 4: Advanced Kibana Options

Understanding your data for analysis in Kibana

Here, we will discuss the different aspects of data analysis: data shipping, data ingestion, data storage, and data visualization. Each is an important part of data analysis and visualization, and we need to understand them in detail. The objective is to avoid confusion and to build an architecture that serves each of the following aspects.

Data shipping

A data-shipping architecture should support the transport of any sort of data or event, whether structured or unstructured. The primary goal of data shipping is to send data from remote machines to a centralized location in order to make it available for further exploration. For data shipping, we generally deploy lightweight agents that sit on the same server from which we want to get the data. These shippers fetch the data and keep sending it to the centralized server. For data shipping, we need to consider the following:

  • The agents should be lightweight. They should not compete for resources with the process that generates the actual data; the goal is to minimize the performance impact and leave as small a footprint as possible on the host.
  • There are many data-shipping technologies out there; some are tied to a specific technology, while others are based on an extensible framework that can adapt to virtually any data source.
  • Shipping data is not only about sending data over the wire; it's also about security, and about making sure that the data reaches the proper destination through an end-to-end secured pipeline.
  • Another aspect of data shipping is the management of data loads. Data should be shipped at a rate that the end destination is able to ingest; this feature is called back pressure management.

Reliable data shipping is essential for data visualization. As an example, consider data flowing from financial trading machines: failing to detect a security breach simply because you are losing data could be critical. The short sketch that follows illustrates the back pressure idea.
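
To make back pressure concrete, here is a minimal sketch of a shipper loop in Python. It is illustrative only: the endpoint URL, batch size, and retry delay are assumptions, not part of any real shipper. It tails a file, sends events in batches, and backs off when the destination signals overload (HTTP 429), rather than dropping data:

    import json
    import time
    import urllib.error
    import urllib.request

    # Hypothetical centralized ingestion endpoint; adjust for your setup.
    ENDPOINT = "http://ingest.example.com:8080/bulk"
    BATCH_SIZE = 100

    def ship(batch):
        """Send one batch; return False if the destination asks us to back off."""
        req = urllib.request.Request(
            ENDPOINT,
            data=json.dumps(batch).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        try:
            with urllib.request.urlopen(req) as resp:
                return 200 <= resp.status < 300
        except urllib.error.HTTPError as err:
            if err.code == 429:  # Too Many Requests: the back pressure signal
                return False
            raise

    def tail_and_ship(path):
        """Read events from a file and ship them without losing any."""
        batch = []
        with open(path) as source:
            for line in source:
                batch.append({"message": line.rstrip("\n")})
                if len(batch) >= BATCH_SIZE:
                    while not ship(batch):  # back off instead of dropping
                        time.sleep(1.0)
                    batch = []
        if batch:  # flush the final partial batch
            while not ship(batch):
                time.sleep(1.0)

Real shippers such as Beats implement this far more robustly, with persistent queues and acknowledgements, but the principle is the same: the sender adapts its rate to what the destination can ingest.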

Data ingestion

The role of the ingestion layer is to receive data, supporting as wide a range of commonly used transport protocols and data formats as possible, while providing the capabilities to extract and transform this data before finally storing it.

Processing data in this way is essentially a matter of extracting, transforming, and loading (ETL) it; such a process is often called an ingestion pipeline, and it receives data from the shipping layer and pushes it to a storage layer. It comes with the following features:

  • Generally, the ingestion layer has a pluggable architecture, in which a set of plugins eases integration with the various sources of data and destinations. Some of the plugins are made for receiving data from shippers, but data is not always received from shippers: it can also come directly from a data source such as a file, a network socket, or even a database. This can be ambiguous in some cases: should I use a shipper or a pipeline to ingest data from a file? That will, of course, depend on the use case and also on the expected SLAs.
  • The ingestion layer should be used to prepare the data by, for example, parsing it, formatting it, correlating it with other data sources, and normalizing and enriching it before storage. This has many advantages, the most important being that it can improve the quality of the data, providing better insights for visualization. Another advantage is that it can remove processing overhead later on, by precomputing a value or looking up a reference at ingest time. The drawback is that you may need to ingest the data again if it was not properly formatted or enriched for visualization. Fortunately, there are also ways to process the data after it has been ingested. A minimal parsing-and-enrichment sketch follows this list.
  • Ingesting and transforming data consumes compute resources. It is essential that we consider this, usually in terms of maximum data throughput per unit of time, and plan for ingestion by distributing the load over multiple ingestion instances. This is a very important aspect of real-time, or, to be precise, near real-time visualization. Spreading ingestion across multiple instances can accelerate the storage of the data and, therefore, make it available faster for visualization.
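
To illustrate the preparation step described above, here is a minimal, hypothetical transform stage in Python. The log format, the regular expression, and the lookup table are all assumptions made for the example: it parses a raw line, normalizes the timestamp and status code, and enriches the event with a reference lookup before it is stored:

    import re
    from datetime import datetime, timezone

    # Assumed log format: "2023-04-01T12:00:00 GET /index.html 200"
    LINE_RE = re.compile(r"(?P<ts>\S+) (?P<method>\S+) (?P<path>\S+) (?P<status>\d+)")

    # Illustrative enrichment table, e.g., mapping request paths to services.
    SERVICE_BY_PATH = {"/index.html": "web-frontend"}

    def transform(line):
        """Parse, normalize, and enrich one raw event before storage."""
        match = LINE_RE.match(line)
        if match is None:
            return None  # route unparsable lines to a dead-letter queue
        event = match.groupdict()
        # Normalize: ISO timestamp to epoch milliseconds, status to an integer.
        ts = datetime.fromisoformat(event.pop("ts")).replace(tzinfo=timezone.utc)
        event["@timestamp"] = int(ts.timestamp() * 1000)
        event["status"] = int(event["status"])
        # Enrich: precompute a reference lookup at ingest time.
        event["service"] = SERVICE_BY_PATH.get(event["path"], "unknown")
        return event

    print(transform("2023-04-01T12:00:00 GET /index.html 200"))

In the Elastic Stack, this role is typically played by Logstash filters or Elasticsearch ingest pipelines rather than custom code, but the shape of the work is the same.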

Storing data at scale

Storage is undoubtedly the centerpiece of a data-driven architecture. It provides the essential, long-term retention of your data, as well as the core functionality to search, analyze, and discover insights in it. It is the heart of the process, and what it offers will depend on the nature of the technology. Here are some aspects that the storage layer usually brings:

  • Scalability is the main aspect: the storage must accommodate volumes of data that can grow from gigabytes to terabytes to petabytes. The scaling is horizontal, which means that, as demand and volume grow, you should be able to increase the capacity of the storage seamlessly by adding more machines (a short sketch follows this list).
  • Most of the time, a non-relational, highly distributed data store is used, namely, a NoSQL data store, which allows fast data access and analysis on high volumes and a variety of data types. Data is partitioned and spread over a set of machines in order to balance the load while reading or writing data.
  • For data visualization, it's essential that the storage exposes an API for analysis on top of the data. Leaving the visualization layer to do the statistical analysis, such as grouping data over a given dimension (aggregation), wouldn't scale.
  • The nature of the API can depend on the expectations of the visualization layer, but most of the time it's about aggregations. The visualization layer should only render the result of the heavy lifting done at the storage level.
  • A data-driven architecture can serve data to many different applications and users, at different levels of SLAs. High availability becomes the norm in such architectures, and, like scalability, it should be part of the nature of the solution.
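
As a small illustration of the scalability and high-availability points above, the following sketch creates an index with the official Elasticsearch Python client (8.x style; older clients take a body argument instead). The URL and index name are assumptions. Primary shards partition the data across the machines of the cluster, while replicas keep it available if a node fails:

    from elasticsearch import Elasticsearch

    # Assumed local cluster; adjust the URL for your deployment.
    es = Elasticsearch("http://localhost:9200")

    # Three primary shards spread reads and writes across the cluster;
    # one replica of each shard preserves availability if a node is lost.
    es.indices.create(
        index="trades",  # illustrative index name
        settings={
            "number_of_shards": 3,
            "number_of_replicas": 1,
        },
    )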

Visualizing data

The visualization layer is the window on the data. It provides a set of tools to create live graphs and charts that bring the data to life, allowing you to build rich, insightful dashboards that answer questions such as: What is happening now? Is my business healthy? What is the mood of the market?

The visualization layer in a data-driven architecture is the layer where we expect the majority of the data consumption to happen; it is mostly focused on surfacing KPIs on top of the stored data. It comes with the following essential features:

  • It should be lightweight and only render the result of the processing done in the storage layer (see the sketch just after this list)
  • It allows the user to discover the data and get quick out-of-the-box insights on the data
  • It offers a visual way to ask unexpected questions of the data, rather than having to implement the proper request to do so
  • In modern data architectures that must address the needs of accessing KPIs as fast as possible, the visualization layer should render the data in near real time
  • The visualization framework should be extensible and allow users to customize the existing assets or to add new features depending on the need
  • The user should be able to share the dashboards outside of the visualization application
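
As a sketch of the first point in the list above, here the storage layer is asked for a terms aggregation and the "visualization" merely renders the returned buckets as a crude text bar chart. The index and field names are assumptions, and the snippet again assumes the 8.x Elasticsearch Python client:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # The storage layer does the heavy lifting: group documents by service
    # and count them; size=0 means no raw documents are returned.
    resp = es.search(
        index="trades",  # illustrative index name
        size=0,
        aggs={"by_service": {"terms": {"field": "service.keyword"}}},
    )

    # The visualization layer only renders the precomputed result.
    for bucket in resp["aggregations"]["by_service"]["buckets"]:
        print(f"{bucket['key']:<20} {'#' * bucket['doc_count']}")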

As you can see, it's not only a matter of visualization: you need some foundations to reach these objectives. This is how we'll address the use of Kibana in this book: we'll focus on use cases and see the best way to leverage Kibana's features, depending on the use case and context.

Kibana's main differentiator from other visualization tools is that it comes alongside a full stack, the Elastic Stack, with seamless integration with every layer of the stack, which eases the deployment of such an architecture. There are many other technologies out there; we'll now explore what they are good at and what their limits are.