Chapter 8. All About Sampling
Gathering monitoring data in production is always a compromise between cost, in terms of storage and performance overhead, and the expressiveness of the collected data. The more data we collect, the better we hope to be able to diagnose a problem should something go wrong, yet we don't want to slow down the applications or pay exorbitant bills for storage. Even though most logging frameworks support multiple levels of log severity, common wisdom is to tune the loggers in production to discard anything logged at the debug level or lower. Many organizations even adopt the rule that successful requests should leave no logs at all, and that logs are written only when there is some issue with the request.
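To make the logging rule concrete, here is a minimal sketch using Python's standard logging module. The helper names build_logger and handle_request, the logger name orders.prod, and the choice of WARNING as the production threshold are assumptions made for illustration only; they do not come from any particular framework or organization.

```python
import io
import logging


def build_logger(name: str, level: int, stream) -> logging.Logger:
    """Hypothetical helper: create an isolated logger that writes to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)       # records below this level are discarded
    logger.propagate = False     # keep the root logger out of the picture
    logger.handlers.clear()      # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def handle_request(logger: logging.Logger, request_id: str, fail: bool = False) -> None:
    """Hypothetical request handler: verbose at debug level, loud only on failure."""
    logger.debug("request %s: parsed payload", request_id)
    if fail:
        logger.error("request %s: backend timed out", request_id)


# Assumed production setting: keep WARNING and above, drop debug and info.
buffer = io.StringIO()
prod_logger = build_logger("orders.prod", logging.WARNING, buffer)

handle_request(prod_logger, "r1")             # successful request: leaves no logs
handle_request(prod_logger, "r2", fail=True)  # failed request: one error record

print(buffer.getvalue(), end="")
# prints: ERROR request r2: backend timed out
```

With this threshold, the successful request produces no output at all, and only the failed request leaves a record, which is the trade-off described above: less storage and overhead in exchange for less detail when diagnosing problems.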
Distributed tracing is not immune to this compromise either. Depending on the verbosity of the instrumentation, tracing data can easily exceed the volume of the actual business traffic sustained by an application. Collecting all that data in memory, and sending...