Observability with Grafana

By: Rob Chapman, Peter Holmes

Overview of this book

To overcome application monitoring and observability challenges, Grafana Labs offers a modern, highly scalable, cost-effective Loki, Grafana, Tempo, and Mimir (LGTM) stack along with Prometheus for the collection, visualization, and storage of telemetry data. Beginning with an overview of observability concepts, this book teaches you how to instrument code and monitor systems in practice using standard protocols and Grafana libraries. As you progress, you’ll create a free Grafana Cloud instance and deploy a demo application to a Kubernetes cluster to delve into the implementation of the LGTM stack. You’ll learn how to connect Grafana Cloud to AWS, GCP, and Azure to collect infrastructure data, build interactive dashboards, make use of service level indicators and objectives to produce great alerts, and leverage the AI and ML capabilities to keep your systems healthy. You’ll also explore real user monitoring with Faro and performance monitoring with Pyroscope and k6. The book also covers advanced concepts such as architecting a Grafana installation, using automation and infrastructure as code tools for DevOps processes, troubleshooting strategies, and best practices to avoid common pitfalls. After reading this book, you’ll be able to use the Grafana stack to deliver amazing operational results for the systems your organization uses.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download a free PDF copy of this book

Part 1: Get Started with Grafana and Observability

Chapter 1: Introducing Observability and the Grafana Stack

Observability in a nutshell

Telemetry types and technologies

Introducing the user personas of observers

Introducing the Grafana stack

Alternatives to the Grafana stack

Deploying the Grafana stack

Summary

Chapter 2: Instrumenting Applications and Infrastructure

Common log formats

Exploring metric types and best practices

Tracing protocols and best practices

Using libraries to instrument efficiently

Infrastructure data technologies

Summary

Chapter 3: Setting Up a Learning Environment with Demo Applications

Technical requirements

Introducing Grafana Cloud

Installing the prerequisite tools

Installing the OpenTelemetry Demo application

Exploring telemetry from the demo application

Troubleshooting your OpenTelemetry Demo installation

Summary

Part 2: Implement Telemetry in Grafana

Chapter 4: Looking at Logs with Grafana Loki

Technical requirements

Updating the OpenTelemetry demo application

Introducing Loki

Understanding LogQL

Exploring Loki’s architecture

Tips, tricks, and best practices

Summary

Chapter 5: Monitoring with Metrics Using Grafana Mimir and Prometheus

Technical requirements

Updating the OpenTelemetry demo application

Introducing PromQL

Exploring data collection and metric protocols

Understanding data storage architectures

Using exemplars in Grafana

Summary

Chapter 6: Tracing Technicalities with Grafana Tempo

Technical requirements

Updating the OpenTelemetry Demo application

Introducing Tempo and the TraceQL query language

Exploring tracing protocols

Understanding the Tempo architecture

Summary

Chapter 7: Interrogating Infrastructure with Kubernetes, AWS, GCP, and Azure

Technical requirements

Monitoring Kubernetes using Grafana

Visualizing AWS telemetry with Grafana Cloud

Monitoring GCP using Grafana

Monitoring Azure using Grafana

Best practices and approaches

Summary

Part 3: Grafana in Practice

Chapter 8: Displaying Data with Dashboards

Technical requirements

Creating your first dashboard

Developing your dashboard further

Using visualizations in Grafana

Developing a dashboard purpose

Advanced dashboard techniques

Managing and organizing dashboards

Case study – an overall system view

Summary

Chapter 9: Managing Incidents Using Alerts

Technical requirements

Being alerted versus being alarmed

Writing great alerts using SLIs and SLOs

Grafana Alerting

Grafana OnCall

Grafana Incident

Summary

Chapter 10: Automation with Infrastructure as Code

Technical requirements

Benefits of automating Grafana

Introducing the components of observability systems

Automating collection infrastructure with Helm or Ansible

Getting to grips with the Grafana API

Managing dashboards and alerts with Terraform or Ansible

Summary

Chapter 11: Architecting an Observability Platform

Architecting your observability platform

Developing a proof of concept

Setting the right access levels

Sending telemetry to other consumers

Summary

Part 4: Advanced Applications and Best Practices of Grafana

Chapter 12: Real User Monitoring with Grafana

Introducing RUM

Setting up Grafana Frontend Observability

Exploring Web Vitals

Pivoting from frontend to backend data

Enhancements and custom configurations

Summary

Chapter 13: Application Performance with Grafana Pyroscope and k6

Using Pyroscope for continuous profiling

Using k6 for load testing

Summary

Chapter 14: Supporting DevOps Processes with Observability

Introducing the DevOps life cycle

Using Grafana for fast feedback during the development life cycle

Using Grafana to monitor infrastructure and platforms

Summary

Chapter 15: Troubleshooting, Implementing Best Practices, and More with Grafana

Best practices and troubleshooting for data collection

Best practices and troubleshooting for the Grafana stack

Avoiding pitfalls of observability

Future trends in application monitoring

Summary

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book

Introducing Observability and the Grafana Stack

The modern computer systems we work with have moved from the realm of the complicated into the realm of the complex, where the number of interacting variables makes them ultimately unknowable and uncontrollable. We are using the terms complicated and complex as they are used in systems theory. A complicated system, like an engine, has clear causal relationships between components. A complex system, such as the flow of traffic in a city, shows emergent behavior arising from the interactions of its components.

With the Ponemon Institute estimating the average cost of downtime at $9,000 per minute in 2016, this complexity can cause significant financial loss if organizations do not take steps to manage the risk. Observability offers a way to mitigate these risks, but making systems observable comes with its own financial risks if implemented poorly or without a clear business goal.

In this book, we will give you a good understanding of what observability is and who its customers are. We will explore how to use the tools available from Grafana Labs to gain visibility into your organization. These tools include Loki, Prometheus, Mimir, Tempo, Frontend Observability, Pyroscope, and k6. You will learn how to use Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to obtain clear, transparent signals of when a service is operating correctly, and how to use the Grafana incident response tools to handle incidents. Finally, you will learn about managing your observability platform using automation tools such as Ansible, Terraform, and Helm.

This chapter aims to introduce observability to all audiences, using examples outside of the computing world. We’ll introduce the types of telemetry used by observability tools, which will give you an overview of how to use them to quickly understand the state of your services. We’ll outline the various personas who might use observability systems so that you can explore complex ideas later with a clear grounding in who will benefit from their correct implementation. Finally, we’ll investigate Grafana’s Loki, Grafana, Tempo, and Mimir (LGTM) stack, how to deploy it, and what alternatives exist.

In this chapter, we’re going to cover the following main topics:

Observability in a nutshell
Telemetry types and technologies
Understanding the customers of observability
Introducing the Grafana stack
Alternatives to the Grafana stack
Deploying the Grafana stack
