Though the primary objective of this book is to build recommender systems, it helps to first walk through the data-mining techniques they commonly rely on. In this chapter, you will learn about the data preprocessing techniques, data-mining techniques, and evaluation techniques commonly used in recommender systems. The first section explains how a data analysis problem is solved, followed by data preprocessing steps such as similarity measures and dimensionality reduction. The next section covers data-mining techniques and how to evaluate them.

Similarity measures include:

Euclidean distance

Cosine distance

Pearson correlation
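
To make the three similarity measures listed above concrete, here is a minimal sketch in plain Python. The function names and the two rating vectors (`user_1`, `user_2`) are hypothetical, chosen only for illustration. The sketch assumes both vectors have the same length, contain no missing ratings, and are not constant (a constant vector would make the cosine and Pearson denominators zero).

```python
import math

def euclidean_distance(a, b):
    # Straight-line distance between two equal-length rating vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors; 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    # Cosine distance is commonly defined as 1 minus cosine similarity.
    return 1.0 - cosine_similarity(a, b)

def pearson_correlation(a, b):
    # Pearson correlation equals the cosine similarity of the
    # mean-centered vectors.
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)

# Hypothetical ratings given by two users to the same four items.
user_1 = [5.0, 3.0, 4.0, 4.0]
user_2 = [3.0, 1.0, 2.0, 3.0]

print("Euclidean distance: ", euclidean_distance(user_1, user_2))
print("Cosine distance:    ", cosine_distance(user_1, user_2))
print("Pearson correlation:", pearson_correlation(user_1, user_2))
```

Note that a smaller Euclidean or cosine distance means the users are more alike, whereas a Pearson correlation closer to 1 means their ratings move together more strongly.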

Dimensionality reduction techniques include:

Principal component analysis

Data-mining techniques include:

k-means clustering

Support vector machine

Ensemble methods, such as bagging, boosting, and random forests