Modern Distributed Tracing in .NET

By: Liudmila Molkova

Overview of this book

As distributed systems become more complex and dynamic, their observability needs to grow to aid the development of holistic solutions for performance or usage analysis and debugging. Distributed tracing brings structure, correlation, causation, and consistency to your telemetry, thus allowing you to answer arbitrary questions about your system and creating a foundation for observability vendors to build visualizations and analytics. Modern Distributed Tracing in .NET is your comprehensive guide to observability that focuses on tracing and performance analysis using a combination of telemetry signals and diagnostic tools. You'll begin by learning how to instrument your apps automatically as well as manually in a vendor-neutral way. Next, you’ll explore how to produce useful traces and metrics for typical cloud patterns and get insights into your system and investigate functional, configurational, and performance issues. The book is filled with instrumentation examples that help you grasp how to enrich auto-generated telemetry or produce your own to get the level of detail your system needs, along with controlling your costs with sampling, aggregation, and verbosity. By the end of this book, you'll be ready to adopt and leverage tracing and other observability signals and tools and tailor them to your needs as your system evolves.
Table of Contents (23 chapters)
Part 1: Introducing Distributed Tracing
Part 2: Instrumenting .NET Applications
Part 3: Observability for Common Cloud Scenarios
Part 4: Implementing Distributed Tracing in Your Organization

What this book covers

Chapter 1, Observability Needs of Modern Applications, provides an overview of common monitoring techniques and introduces distributed tracing. It covers OpenTelemetry, a vendor-agnostic telemetry platform, and shows how it addresses the observability challenges of distributed applications with correlated telemetry signals.

Chapter 2, Native Monitoring in .NET, offers an overview of the diagnostic capabilities that .NET provides out of the box. These capabilities include structured and correlated logs and counters, along with ad hoc monitoring with the dotnet-monitor tool. We’ll also instrument our first application with OpenTelemetry and get hands-on experience with distributed tracing.

Chapter 3, The .NET Observability Ecosystem, explores a broader set of tracing instrumentations and environments. We’ll learn how to find and evaluate instrumentation libraries, get traces from infrastructure such as Dapr, and finally instrument serverless applications using AWS Lambda and Azure Functions as examples.

Chapter 4, Low-Level Performance Analysis with Diagnostic Tools, provides an introduction to lower-level .NET diagnostics and performance analysis. We’ll see how to collect and analyze runtime counters and performance traces to get more observability within the process when distributed tracing does not provide enough input.

Chapter 5, Configuration and Control Plane, provides an overview of OpenTelemetry configuration and customization. We’ll explore different sampling strategies and learn how to enrich and filter spans or customize metrics collection. Finally, we’ll introduce the OpenTelemetry Collector, an agent that can take care of many telemetry post-processing tasks.

Chapter 6, Tracing Your Code, dives into tracing instrumentation with the .NET tracing APIs or the OpenTelemetry shim. Here, we’ll learn about the Activity and ActivitySource classes used to collect spans, see how to leverage ambient context propagation within the process, and record events and exceptions. We’ll also cover integration testing for your instrumentation code.

Chapter 7, Adding Custom Metrics, delves into the modern .NET metrics API. You’ll learn about the available instruments (counters, gauges, and histograms), which aggregate measurements in different ways, and get hands-on experience implementing and using metrics to monitor system health or to investigate performance issues.

Chapter 8, Writing Structured and Correlated Logs, provides an overview of logging in .NET, focusing on Microsoft.Extensions.Logging. We’ll learn to write structured and queryable logs efficiently and collect them with OpenTelemetry. We’ll also look into managing logging costs using the OpenTelemetry Collector.

Chapter 9, Best Practices, provides guidance on choosing the most suitable telemetry signals depending on application needs and scenarios, and shows how to control telemetry costs with minimal impact on observability. It also introduces OpenTelemetry semantic conventions, which are telemetry collection recipes for common patterns and technologies.

Chapter 10, Tracing Network Calls, explores network call instrumentation using gRPC as an example. We’ll learn how to instrument simple request-response calls following RPC semantic conventions and propagate context. We’ll also cover challenges and possible solutions when instrumenting streaming calls.

Chapter 11, Instrumenting Messaging Scenarios, explores instrumentation for asynchronous processing scenarios. We’ll learn how to trace messages end-to-end, instrument batching scenarios, and introduce messaging-specific metrics that help detect scaling and performance issues.

Chapter 12, Instrumenting Database Calls, explores database and cache instrumentation with tracing and metrics. We’ll also cover forwarding external metrics from a Redis instance into our observability backend and using the collected telemetry for performance analysis and caching strategy optimization.

Chapter 13, Driving Change, covers organizational and planning aspects related to observability improvements. We’ll discuss the costs of low observability and suggest several ways to measure them. We’ll come up with an onboarding plan, talk about common pitfalls, and see how to benefit from better observability in daily development tasks.

Chapter 14, Creating Your Own Conventions, provides suggestions on how to collect telemetry consistently across the system starting with a unified OpenTelemetry configuration. We’ll also learn to define custom semantic conventions and implement them in shared code, making it easy to follow them.

Chapter 15, Instrumenting Brownfield Applications, discusses the challenges of instrumenting newer parts of the system in the presence of legacy services. We’ll suggest solutions that can minimize changes to legacy components and learn to leverage legacy correlation propagation formats, implement minimalistic pass-through context propagation, and forward telemetry from legacy services to the new backend.