Clojure for Data Science

By: Henry Garner

Performing the t-test


The t-test differs in the probability distribution from which the p-value is calculated. Having calculated the t-statistic, we look up its value in the t-distribution parameterized by the degrees of freedom of our data:

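(defn t-test
  "One-tailed p-value for the t-statistic of samples a and b."
  [a b]
  (let [df (+ (count a) (count b) -2)] ; degrees of freedom
    ;; Upper-tail probability of |t| under the t-distribution
    (- 1 (s/cdf-t (i/abs (t-stat a b)) :df df))))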

The degrees of freedom are two less than the sizes of the samples combined, which is 298 for our samples.
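
The t-test function above relies on t-stat, which is defined outside this section. As a rough illustration of how the degrees of freedom and the statistic fit together, the following sketch assumes the classic pooled-variance form of the two-sample t-statistic. The helper names (mean, sample-variance, degrees-of-freedom, pooled-t-stat) are hypothetical, and the book's own t-stat may be computed differently.

```clojure
;; Hypothetical helpers for illustration only; the book's own t-stat
;; is defined elsewhere and may use a different standard error.

(defn mean
  "Arithmetic mean of a non-empty collection of numbers."
  [xs]
  (/ (reduce + xs) (double (count xs))))

(defn sample-variance
  "Unbiased sample variance (divides by n - 1)."
  [xs]
  (let [m (mean xs)]
    (/ (reduce + (map #(let [d (- % m)] (* d d)) xs))
       (dec (count xs)))))

(defn degrees-of-freedom
  "Two less than the combined size of both samples."
  [a b]
  (+ (count a) (count b) -2))

(defn pooled-t-stat
  "Two-sample t-statistic using a pooled variance estimate
  (an assumption made for this sketch)."
  [a b]
  (let [na         (count a)
        nb         (count b)
        pooled-var (/ (+ (* (dec na) (sample-variance a))
                         (* (dec nb) (sample-variance b)))
                      (degrees-of-freedom a b))
        se         (Math/sqrt (* pooled-var (+ (/ 1.0 na) (/ 1.0 nb))))]
    (/ (- (mean a) (mean b)) se)))
```

With the made-up samples [1 2 3 4 5] and [2 3 4 5 6], degrees-of-freedom returns 8 and pooled-t-stat returns -1.0. Any two samples whose sizes total 300 give the 298 degrees of freedom quoted above.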

Recall that we are performing a hypothesis test, so let's state our null and alternative hypotheses:

  • H0: This sample is drawn from a population with a supplied mean

  • H1: This sample is drawn from a population with a greater mean

Let's run the example:

(defn ex-2-16 []
  (let [data (->> (load-data "new-site.tsv")
                  (:rows)
                  (group-by :site)
                  (map-vals (partial map :dwell-time)))
        a (get data 0)
        b (get data 1)]
    (t-test a b)))

;; 0.0503

This...
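
The ex-2-16 listing uses map-vals, which is not part of clojure.core and is not defined in this section. The sketch below shows one way such a helper could be written, and how group-by followed by map-vals turns a sequence of rows into one collection of dwell times per site. The map-vals definition and the three sample rows are assumptions made for illustration, not the book's data or implementation.

```clojure
;; Hypothetical stand-in for the map-vals helper used in ex-2-16.
(defn map-vals
  "Applies f to every value of map m, keeping the keys unchanged."
  [f m]
  (into {} (map (fn [[k v]] [k (f v)])) m))

;; Made-up rows, only to show the shape of the data.
(def rows
  [{:site 0 :dwell-time 12}
   {:site 1 :dwell-time 30}
   {:site 0 :dwell-time 45}])

;; Group the rows by site, then keep only the dwell times.
(def by-site
  (->> rows
       (group-by :site)
       (map-vals (partial map :dwell-time))))

;; (get by-site 0) holds the dwell times for site 0,
;; (get by-site 1) those for site 1.
```

Taking the function first and the map last lets the helper sit at the end of a ->> pipeline, which is how it is called in ex-2-16.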