Distributed tracing, also known as end-to-end or workflow-centric tracing, is a family of techniques that aim to capture the detailed execution of causally related activities performed by the components of a distributed system. Unlike traditional code profilers or host-level tracing tools, such as dtrace [1], end-to-end tracing focuses primarily on profiling individual executions that are performed cooperatively by many different processes, usually running on many different hosts, as is typical of modern, cloud-native, microservices-based applications.
In the previous chapter, we saw a tracing system in action from the end user's perspective. In this chapter, we will discuss the basic ideas underlying distributed tracing; the various approaches presented in industry and in academic works for implementing end-to-end tracing; the impact and trade-offs of the architectural decisions taken by different tracing systems on their capabilities...