Lessons Learned
This chapter strives to answer questions of the genre “how different is different.” Such questions necessarily bring up the subject of statistics, which has been studying ways to answer such questions for almost two centuries.
The normal distribution, which is defined by an average and a standard deviation, is very important in statistics. The z-score measures how far a value is from the average, in units of standard deviations. Large z-scores (regardless of sign) are very unlikely to arise by chance. That is, the value is probably not produced by a random process, so something is happening.
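As a small illustration of the z-score idea, the following Python sketch computes a z-score and the probability that a normally distributed value lands at least that far from the average in either direction. The function names and the sample numbers (an average of 100 and a standard deviation of 15) are assumptions made up for this example, not taken from the chapter.

```python
import math

def z_score(value, mean, std_dev):
    """Distance of value from the mean, measured in standard deviations."""
    return (value - mean) / std_dev

def two_sided_tail_probability(z):
    """Probability that a standard normal value is at least |z| away from 0.

    Uses the identity P(|Z| >= z) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers, for illustration only.
z = z_score(130, mean=100, std_dev=15)
print(z, two_sided_tail_probability(z))
```

The larger the absolute value of the z-score, the smaller the tail probability, which is why a large z-score suggests that something other than chance is at work.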
Counts are very important in customer databases. There are three approaches to determining whether counts for different groups are the same or different. The binomial distribution counts every possible combination, so it is quite precise. The standard error of proportions is useful for getting z-scores. Finally, the chi-square test directly compares counts across multiple dimensions...
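The three approaches can be sketched side by side in Python. This is a minimal sketch under stated assumptions rather than a definitive implementation: the function names are hypothetical, the group sizes and counts are invented for illustration, the binomial helper requires a probability strictly between 0 and 1, and the chi-square helper returns only the statistic (not a p-value) and assumes no row or column total is zero.

```python
import math

def binomial_tail(n, p, k):
    """Exact probability of k or more successes in n trials (0 < p < 1).

    Sums the binomial terms directly; logarithms keep large n from overflowing.
    """
    total = 0.0
    for i in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log(1 - p))
        total += math.exp(log_term)
    return total

def proportion_z(successes1, n1, successes2, n2):
    """z-score for the difference of two proportions, using the
    standard error of each proportion, sqrt(p * (1 - p) / n)."""
    p1, p2 = successes1 / n1, successes2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

def chi_square(observed):
    """Chi-square statistic for a table of counts (list of rows).

    Expected count for a cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical data: two groups of 5,000 customers with 250 and 300 stops.
print(binomial_tail(5000, 0.05, 300))
print(proportion_z(250, 5000, 300, 5000))
print(chi_square([[250, 4750], [300, 4700]]))
```

The binomial calculation is exact but becomes slower as the number of trials grows; the z-score and chi-square calculations are approximations that stay cheap at any size.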