Book Image

Observability with Grafana

By : Rob Chapman, Peter Holmes
Book Image

Observability with Grafana

By: Rob Chapman, Peter Holmes

Overview of this book

To overcome application monitoring and observability challenges, Grafana Labs offers a modern, highly scalable, cost-effective Loki, Grafana, Tempo, and Mimir (LGTM) stack along with Prometheus for the collection, visualization, and storage of telemetry data. Beginning with an overview of observability concepts, this book teaches you how to instrument code and monitor systems in practice using standard protocols and Grafana libraries. As you progress, you’ll create a free Grafana cloud instance and deploy a demo application to a Kubernetes cluster to delve into the implementation of the LGTM stack. You’ll learn how to connect Grafana Cloud to AWS, GCP, and Azure to collect infrastructure data, build interactive dashboards, make use of service level indicators and objectives to produce great alerts, and leverage the AI & ML capabilities to keep your systems healthy. You’ll also explore real user monitoring with Faro and performance monitoring with Pyroscope and k6. Advanced concepts like architecting a Grafana installation, using automation and infrastructure as code tools for DevOps processes, troubleshooting strategies, and best practices to avoid common pitfalls will also be covered. After reading this book, you’ll be able to use the Grafana stack to deliver amazing operational results for the systems your organization uses.
Table of Contents (22 chapters)
1
Part 1: Get Started with Grafana and Observability
5
Part 2: Implement Telemetry in Grafana
10
Part 3: Grafana in Practice
15
Part 4: Advanced Applications and Best Practices of Grafana

Introducing Observability and the Grafana Stack

The modern computer systems we work with have moved from the realm of complicated into the realm of complex, where the number of interacting variables make them ultimately unknowable and uncontrollable. We are using the terms complicated and complex as per system theory. A complicated system, like an engine, has clear causal relationships between components. A complex system, such as the flowing of traffic in a city, shows emergent behavior from the interactions of its components.

With the average cost of downtime estimated to be $9,000 per minute by Ponemon Institute in 2016, this complexity can cause significant financial loss if organizations do not take steps to manage this risk. Observability offers a way to mitigate these risks, but making systems observable comes with its own financial risks if implemented poorly or without a clear business goal.

In this book, we will give you a good understanding of what observability is and who the customers who might use it are. We will explore how to use the tools available from Grafana Labs to gain visibility of your organization. These tools include Loki, Prometheus, Mimir, Tempo, Frontend Observability, Pyroscope, and k6. You will learn how to use Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to obtain clear transparent signals of when a service is operating correctly, and how to use the Grafana incident response tools to handle incidents. Finally, you will learn about managing their observability platform using automation tools such as Ansible, Terraform, and Helm.

This chapter aims to introduce observability to all audiences, using examples outside of the computing world. We’ll introduce the types of telemetry used by observability tools, which will give you an overview of how to use them to quickly understand the state of your services. The various personas who might use observability systems will be outlined so that you can explore complex ideas later with a clear grounding on who will benefit from their correct implementation. Finally, we’ll investigate Grafana’s Loki, Grafana, Tempo, Mimir (LGTM) stack, how to deploy it, and what alternatives exist.

In this chapter, we’re going to cover the following main topics:

  • Observability in a nutshell
  • Telemetry types and technologies
  • Understanding the customers of observability
  • Introducing the Grafana stack
  • Alternatives to the Grafana stack
  • Deploying the Grafana stack