Clojure for Data Science

By: Henry Garner
The curse of dimensionality


There is one problem that the Mahalanobis distance measure cannot overcome, though: the curse of dimensionality. As the number of dimensions in a dataset rises, every point tends to become equally far from every other point, so distance becomes less useful for telling points apart. We can demonstrate this quite simply with the following code:

(defn ex-6-27 []
  (let [distances (for [d (range 2 101) ; same range as the x-values plotted below
                        :let [data (->> (dataset-of-dimension d)
                                        (s/mahalanobis-distance)
                                        (map first))]]
                    [(apply min data) (apply max data)])]
    (-> (c/xy-plot (range 2 101) (map first distances)
                   :x-label "Number of Dimensions"
                   :y-label "Distance Between Points"
                   :series-label "Minimum Distance"
                   :legend true)
        (c/add-lines (range 2 101) (map second distances)
                     :series-label "Maximum Distance...