Clojure for Data Science

By: Henry Garner

Chapter 6. Clustering

Things that have a common quality ever quickly seek their kind.

--Marcus Aurelius

In previous chapters, we covered multiple learning algorithms: linear and logistic regression, C4.5, naive Bayes, and random forests. In each case, we were required to train the algorithm by providing features and a desired output. In linear regression, for example, the desired output was the weight of an Olympic swimmer, whereas for the other algorithms we provided a class: whether the passenger survived or perished. These are examples of supervised learning algorithms: we tell the algorithm the desired output, and it attempts to learn a model that reproduces it.

There is another class of learning algorithm referred to as unsupervised learning. Unsupervised algorithms are able to operate on the data without a set of reference answers. We may not even know ourselves what structure lies within the data; the algorithm will attempt to determine the structure for itself.

Clustering is an...