Observability with Grafana

By: Rob Chapman, Peter Holmes

Overview of this book

To overcome application monitoring and observability challenges, Grafana Labs offers a modern, highly scalable, cost-effective Loki, Grafana, Tempo, and Mimir (LGTM) stack along with Prometheus for the collection, visualization, and storage of telemetry data. Beginning with an overview of observability concepts, this book teaches you how to instrument code and monitor systems in practice using standard protocols and Grafana libraries. As you progress, you’ll create a free Grafana Cloud instance and deploy a demo application to a Kubernetes cluster to delve into the implementation of the LGTM stack. You’ll learn how to connect Grafana Cloud to AWS, GCP, and Azure to collect infrastructure data, build interactive dashboards, make use of service level indicators and objectives to produce great alerts, and leverage the AI & ML capabilities to keep your systems healthy. You’ll also explore real user monitoring with Faro and performance monitoring with Pyroscope and k6. The book also covers advanced concepts such as architecting a Grafana installation, using automation and infrastructure as code tools for DevOps processes, troubleshooting strategies, and best practices to avoid common pitfalls. After reading this book, you’ll be able to use the Grafana stack to deliver amazing operational results for the systems your organization uses.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Share Your Thoughts

Download a free PDF copy of this book

Part 1: Get Started with Grafana and Observability

Chapter 1: Introducing Observability and the Grafana Stack

Observability in a nutshell

Telemetry types and technologies

Introducing the user personas of observers

Introducing the Grafana stack

Alternatives to the Grafana stack

Deploying the Grafana stack

Chapter 2: Instrumenting Applications and Infrastructure

Common log formats

Exploring metric types and best practices

Tracing protocols and best practices

Using libraries to instrument efficiently

Infrastructure data technologies

Chapter 3: Setting Up a Learning Environment with Demo Applications

Technical requirements

Introducing Grafana Cloud

Installing the prerequisite tools

Installing the OpenTelemetry Demo application

Exploring telemetry from the demo application

Troubleshooting your OpenTelemetry Demo installation

Part 2: Implement Telemetry in Grafana

Chapter 4: Looking at Logs with Grafana Loki

Technical requirements

Updating the OpenTelemetry demo application

Introducing Loki

Understanding LogQL

Exploring Loki’s architecture

Tips, tricks, and best practices

Chapter 5: Monitoring with Metrics Using Grafana Mimir and Prometheus

Technical requirements

Updating the OpenTelemetry demo application

Introducing PromQL

Exploring data collection and metric protocols

Understanding data storage architectures

Using exemplars in Grafana

Chapter 6: Tracing Technicalities with Grafana Tempo

Technical requirements

Updating the OpenTelemetry Demo application

Introducing Tempo and the TraceQL query language

Exploring tracing protocols

Understanding the Tempo architecture

Chapter 7: Interrogating Infrastructure with Kubernetes, AWS, GCP, and Azure

Technical requirements

Monitoring Kubernetes using Grafana

Visualizing AWS telemetry with Grafana Cloud

Monitoring GCP using Grafana

Monitoring Azure using Grafana

Best practices and approaches

Part 3: Grafana in Practice

Chapter 8: Displaying Data with Dashboards

Technical requirements

Creating your first dashboard

Developing your dashboard further

Using visualizations in Grafana

Developing a dashboard purpose

Advanced dashboard techniques

Managing and organizing dashboards

Case study – an overall system view

Chapter 9: Managing Incidents Using Alerts

Technical requirements

Being alerted versus being alarmed

Writing great alerts using SLIs and SLOs

Grafana Alerting

Grafana Incident

Chapter 10: Automation with Infrastructure as Code

Technical requirements

Benefits of automating Grafana

Introducing the components of observability systems

Automating collection infrastructure with Helm or Ansible

Getting to grips with the Grafana API

Managing dashboards and alerts with Terraform or Ansible

Chapter 11: Architecting an Observability Platform

Architecting your observability platform

Developing a proof of concept

Setting the right access levels

Sending telemetry to other consumers

Part 4: Advanced Applications and Best Practices of Grafana

Chapter 12: Real User Monitoring with Grafana

Introducing RUM

Setting up Grafana Frontend Observability

Exploring Web Vitals

Pivoting from frontend to backend data

Enhancements and custom configurations

Chapter 13: Application Performance with Grafana Pyroscope and k6

Using Pyroscope for continuous profiling

Using k6 for load testing

Chapter 14: Supporting DevOps Processes with Observability

Introducing the DevOps life cycle

Using Grafana for fast feedback during the development life cycle

Using Grafana to monitor infrastructure and platforms

Chapter 15: Troubleshooting, Implementing Best Practices, and More with Grafana

Best practices and troubleshooting for data collection

Best practices and troubleshooting for the Grafana stack

Avoiding pitfalls of observability

Future trends in application monitoring

Index

Other Books You May Enjoy

Introducing Tempo and the TraceQL query language

Tempo and TraceQL are the newest of the tools and query languages we will explore in depth in this book. Like LogQL, TraceQL was built using PromQL as an inspiration and offers developers and operators a familiar set of filtering, aggregation, and mathematical tools that aid in the observability flow between metrics, logs, and traces.
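
To give a feel for the filtering and aggregation mentioned above, here is a minimal Python sketch that renders two TraceQL query strings. The `spanset_filter` helper, the `frontend` service name, and the thresholds are hypothetical and exist only for illustration; the sketch assumes TraceQL's curly-brace spanset filters, the `duration` intrinsic, and the `count()` aggregate, and it does not talk to Tempo.

```python
from typing import Optional


def spanset_filter(attributes: dict[str, str], min_duration: Optional[str] = None) -> str:
    """Render a TraceQL spanset filter such as { span.http.method = "GET" }.

    Hypothetical helper: values are inserted as-is, so it does not escape
    quotes and only supports string equality plus a minimum duration.
    """
    parts = [f'{key} = "{value}"' for key, value in attributes.items()]
    if min_duration is not None:
        # duration is a span intrinsic; 100ms and 2s are duration literals.
        parts.append(f"duration > {min_duration}")
    if not parts:
        return "{ }"  # an empty filter selects every span
    return "{ " + " && ".join(parts) + " }"


# Filtering: spans from a made-up "frontend" service that took over 100ms.
slow_frontend = spanset_filter({"resource.service.name": "frontend"}, min_duration="100ms")

# Aggregation: keep only traces in which more than two spans match the filter.
many_slow = f"{slow_frontend} | count() > 2"

if __name__ == "__main__":
    print(slow_frontend)
    print(many_slow)
```

Anyone who has written PromQL or LogQL will recognize the shape: a selector in curly braces, followed by a pipeline that aggregates or compares the selected data.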

Let’s have a quick look at how Tempo sees trace data:

Trace collection: Introduced in Chapter 2, a trace (or distributed trace) is a collection of data that represents a request propagating through a system. Traces are often collected from multiple applications. Spans are sent by each application to some form of collection architecture and, ultimately, to Tempo for storage and querying.
Trace fields: The following diagram introduces a simplified structure of a trace, similar to the simplified structures of logs, seen in Chapter 4, and metrics, seen in Chapter 5:

Figure 6.1 – A simplified view of a trace containing four spans
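
As a rough companion to the figure, the following Python sketch models a trace as four spans that share one trace ID and link to each other through parent span IDs. The field names loosely follow OpenTelemetry conventions and are an assumption, as are the service names, operations, and timings; the exact fields shown in Figure 6.1 may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    trace_id: str                  # shared by every span in the same trace
    span_id: str                   # unique within the trace
    parent_span_id: Optional[str]  # None marks the root span
    service: str                   # hypothetical service that emitted the span
    name: str                      # operation the span describes
    start_ms: int                  # offset from the start of the trace
    duration_ms: int


# Four spans from three made-up services, all belonging to one request.
SPANS = [
    Span("trace-1", "a", None, "frontend", "GET /checkout", 0, 120),
    Span("trace-1", "b", "a", "cart", "GetCart", 10, 30),
    Span("trace-1", "c", "a", "payment", "Charge", 45, 60),
    Span("trace-1", "d", "c", "payment", "db.query", 50, 20),
]


def render_trace(spans: list[Span]) -> list[str]:
    """Return one indented line per span, parents before their children."""
    children: dict[Optional[str], list[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_span_id, []).append(span)

    lines: list[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        # Siblings are ordered by start time, as in a typical waterfall view.
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            lines.append(f"{'  ' * depth}{span.service}: {span.name} ({span.duration_ms}ms)")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(render_trace(SPANS)))
```

Each application only knows about the spans it emits. The shared trace ID and the parent links are what allow spans arriving from different applications to be reassembled into a single trace.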

...