Book Image

Clojure for Data Science

By : Henry Garner
Book Image

Clojure for Data Science

By: Henry Garner

Overview of this book

Table of Contents (18 chapters)
Clojure for Data Science
Credits
About the Author
Acknowledgments
About the Reviewer
www.PacktPub.com
Preface
Index

Multiple comparisons


The fact that with repeated trials, we increase the probability of discovering a significant effect is called the multiple comparisons problem. In general, the solution to the problem is to demand more significant effects when comparing many samples. There is no straightforward solution to this issue though; even with an α of 0.01, we will make a Type I error on an average of 1 percent of the time.

To develop our intuition about how multiple comparisons and statistical significance relate to each other, let's build an interactive web page to simulate the effect of taking multiple samples. It's one of the advantages of using a powerful and general-purpose programming language like Clojure for data analysis that we can run our data processing code in a diverse array of environments.

The code we've written and run so far for this chapter has been compiled for the Java Virtual Machine. But since 2013, there has been an alternative target environment for our compiled code:...