In the previous chapter, when discussing visual representations of numerical data, we introduced histograms, which represent the way the data is distributed across a number of intervals. One of the drawbacks of histograms is that the number of bins is always chosen somewhat arbitrarily, and a poor choice may give useless or misleading information about the distribution of the data.
We say that histograms abstract some of the characteristics of the data. That is, a histogram allows us to ignore some of the fine-grained variability in the data so that general patterns are more apparent.
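To make the effect of the bin count concrete, the following is a minimal sketch rather than a definitive implementation. It uses only the standard library, and both the helper `histogram_counts` and the synthetic two-cluster sample are assumptions introduced here for illustration. The helper assumes equal-width bins and that the values are not all identical.

```python
import random


def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins  # assumes hi > lo
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts


random.seed(0)
# Hypothetical sample: two clusters of 500 points each, centered at -2 and 2.
data = [random.gauss(-2.0, 0.5) for _ in range(500)]
data += [random.gauss(2.0, 0.5) for _ in range(500)]

# The same data summarized with three different bin counts.
for num_bins in (2, 10, 50):
    print(num_bins, histogram_counts(data, num_bins))
```

With very few bins the counts say little beyond how the sample splits around the midpoint of its range, while with many bins the counts start to reflect the noise of individual points. The data is identical in every case; only the level of abstraction changes.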
Abstraction is, in general, a good thing when analyzing a dataset, but we would also like to have an accurate representation of all data points that is visually compelling and computationally useful. This is provided by the cumulative distribution function. This function has always been important for statistical computations, and cumulative distribution tables were in fact an...