Mastering Clojure Data Analysis

By: Eric Richard Rochester

Failing Benford's Law


So far, we've seen several datasets, all of which conform to Benford's Law, most of them quite strongly. We haven't yet seen a dataset that does not conform to this distribution of initial digits. What would a failing dataset look like?

There are many ways in which we could get data that doesn't conform. Linear data, such as values spread evenly across a range, would have a more uniform distribution of initial digits. However, we can also simulate fraudulent data easily, and in the process, we can learn just how much noise a dataset can absorb before it no longer conforms to Benford's Law.
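To make the point about linear data concrete, here is a minimal sketch that compares the leading-digit proportions of the integers from 1 to 999 against the proportions Benford's Law predicts, log10(1 + 1/d). The helper names (first-digit, benford-expected, digit-proportions) and the choice of range are our own assumptions for illustration; they are not taken from Incanter or from the datasets used earlier.

```clojure
;; Illustrative helpers only; these names are hypothetical, not a library API.

(defn first-digit
  "Leading decimal digit of a positive integer n."
  [n]
  (loop [n n]
    (if (< n 10)
      n
      (recur (quot n 10)))))

(defn benford-expected
  "Proportion of values with leading digit d under Benford's Law."
  [d]
  (Math/log10 (+ 1.0 (/ 1.0 d))))

(defn digit-proportions
  "Sorted map of leading digit (1-9) to its observed proportion in coll."
  [coll]
  (let [n     (count coll)
        freqs (frequencies (map first-digit coll))]
    (into (sorted-map)
          (for [d (range 1 10)]
            [d (/ (double (get freqs d 0)) n)]))))

;; Linear data: every integer from 1 to 999, each appearing once.
(def linear-data (range 1 1000))

;; Print digit, observed proportion, and Benford's expected proportion.
(doseq [[d p] (digit-proportions linear-data)]
  (println d (format "%.3f" p) (format "%.3f" (benford-expected d))))
```

Over this particular range every leading digit occurs equally often, while Benford's Law expects the digit 1 to lead about 30 percent of the time. Other ranges give less perfectly flat results, but the mismatch with the logarithmic curve remains.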

We'll start this experiment with the population data that we looked at earlier and progressively introduce more and more junk into it. We'll randomly replace items in the dataset with random values and re-run incanter.stats/benford-test on the result. When the test finally fails, we can note how many items we've replaced and how far off the new distribution is.
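Before looking at the real implementation, it may help to see the shape of the experiment in a dependency-free sketch. Everything here is an assumption made for illustration: we use synthetic log-uniform data in place of the population data, a hand-rolled Pearson chi-square statistic in place of incanter.stats/benford-test, and a fixed critical value (15.507, for 8 degrees of freedom at the 0.05 level) as the failure threshold. The names benford-like, chi-sq, and replacements-until-failure are hypothetical.

```clojure
(import 'java.util.Random)

(def benford-probs
  "Expected proportion of each leading digit 1-9 under Benford's Law."
  (mapv #(Math/log10 (+ 1.0 (/ 1.0 %))) (range 1 10)))

;; Chi-square critical value for 8 degrees of freedom at alpha = 0.05.
(def critical-value 15.507)

(defn first-digit
  "Leading decimal digit of a positive integer n."
  [n]
  (loop [n n]
    (if (< n 10) n (recur (quot n 10)))))

(defn chi-sq
  "Pearson chi-square statistic of coll's leading digits against Benford's Law.
  A stand-in for a library test, not Incanter's implementation."
  [coll]
  (let [n     (count coll)
        freqs (frequencies (map first-digit coll))]
    (reduce + (map (fn [d p]
                     (let [expected (* n p)
                           diff     (- (get freqs d 0) expected)]
                       (/ (* diff diff) expected)))
                   (range 1 10)
                   benford-probs))))

(defn benford-like
  "Synthetic stand-in data: n integers spread log-uniformly over 1..999999."
  [n ^Random rng]
  (vec (repeatedly n #(long (Math/pow 10.0 (* 6.0 (.nextDouble rng)))))))

(defn replacements-until-failure
  "Overwrites one randomly chosen position at a time with a uniform random
  integer in 1..max-value until the statistic exceeds critical-value.
  A position may be hit more than once, so :replaced counts replacement
  steps, not distinct items. Gives up after 10 steps per item."
  [data max-value ^Random rng]
  (let [max-steps (* 10 (count data))]
    (loop [v (vec data), n 0]
      (let [x (chi-sq v)]
        (if (or (> x critical-value) (>= n max-steps))
          {:replaced n, :chi-sq x}
          (recur (assoc v (.nextInt rng (count v))
                          (inc (.nextInt rng max-value)))
                 (inc n)))))))

;; Example run with a seeded generator so that it is repeatable.
(let [rng  (Random. 42)
      data (benford-like 1000 rng)]
  (println (replacements-until-failure data 999999 rng)))
```

The returned map reports how many replacement steps were needed and the chi-square value at the point of failure. Because the starting data is itself a random sample, it will occasionally fail the test before any junk is introduced, in which case :replaced is 0.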

The primary function is shown as follows...