Chapter 7. Recommender Systems
"People who like this sort of thing will find this the sort of thing they like."
--attributed to Abraham Lincoln
In the previous chapter, we clustered text documents using the k-means algorithm. This required a measure of similarity between the documents being clustered. In this chapter, we'll investigate recommender systems, using the same notion of similarity to suggest items that we think users might like.
We also saw the challenge presented by high-dimensional data: the so-called curse of dimensionality. Although this problem is not specific to recommender systems, this chapter will show a variety of techniques that tackle its effects. In particular, we'll look at ways of identifying the most important dimensions with principal component analysis and singular value decomposition, and at probabilistic ways of compressing very high-dimensional sets with Bloom filters and MinHash. In addition—because determining the similarity...