Clojure for Data Science

By: Henry Garner

Visualizing the dwell times


We can plot a histogram of dwell times by simply extracting the :dwell-time column with i/$:

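;; Assumes the aliases i = incanter.core and c = incanter.charts,
;; and the load-data helper defined earlier in the book.
(defn ex-2-2 []
  (-> (i/$ :dwell-time (load-data "dwell-times.tsv")) ; extract one column
      (c/histogram :x-label "Dwell time (s)"
                   :nbins 50)                          ; split the range into 50 bins
      (i/view)))                                       ; display the chart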

The preceding code generates the following histogram:

This is clearly not normally distributed data, nor even a heavily skewed normal distribution. There is no tail to the left of the peak (a visitor clearly can't be on our site for less than zero seconds). While the data tails off steeply to the right at first, it extends much further along the x axis than we would expect from normally distributed data.
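
One quick numerical check of the asymmetry described above is to compare the mean with the median: in a sample with a long right tail, the few extreme values pull the mean well above the median. The following is a minimal sketch in plain Clojure, not code from the book. The vector sample-dwell-times holds made-up values (it is not read from dwell-times.tsv), and the function names mean, median and right-skewed? are our own illustrative choices.

```clojure
;; Hypothetical dwell times in seconds, invented purely for illustration.
;; These values are NOT taken from dwell-times.tsv.
(def sample-dwell-times [2 3 5 8 9 12 20 45 130 600])

(defn mean
  "Arithmetic mean of a non-empty collection of numbers."
  [xs]
  (/ (reduce + xs) (double (count xs))))

(defn median
  "Middle value of a non-empty collection of numbers.
  When the count is even, averages the two middle values."
  [xs]
  (let [sorted (vec (sort xs))
        n      (count sorted)
        mid    (quot n 2)]
    (if (odd? n)
      (double (nth sorted mid))
      (/ (+ (nth sorted (dec mid)) (nth sorted mid)) 2.0))))

(defn right-skewed?
  "Rough heuristic only: true when the mean exceeds the median,
  as happens when a long right tail drags the mean upward."
  [xs]
  (> (mean xs) (median xs)))

;; Example call at the REPL:
;; (right-skewed? sample-dwell-times)
```

For the made-up sample, the mean is far larger than the median because of the single 600-second visit. A mean above the median is only a heuristic for right skew, not a formal test, so it complements the histogram rather than replacing it.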

When confronted with distributions like this, where values are mostly small but occasionally extreme, it can be useful to plot the y axis on a log scale. Log scales are used to represent quantities that cover a very large range. Chart axes are ordinarily linear and they partition a range into equally sized steps like...