Clojure for Data Science

By: Henry Garner

Chapter 7. Recommender Systems

 

"People who like this sort of thing will find this the sort of thing they like."

 
 --attributed to Abraham Lincoln

In the previous chapter, we performed clustering on text documents using the k-means algorithm. This required us to have a measure of similarity between the text documents to be clustered. In this chapter, we'll be investigating recommender systems and we'll use this notion of similarity to suggest items that we think users might like.

We also saw the challenge presented by high-dimensional data: the so-called curse of dimensionality. Although it's not a problem specific to recommender systems, this chapter will show a variety of techniques that tackle its effects. In particular, we'll look at ways of establishing the most important dimensions with principal component analysis and singular value decomposition, and at probabilistic ways of compressing very high-dimensional sets with Bloom filters and MinHash. In addition, because determining the similarity...