Learning Apache Mahout

Unsupervised learning


Unsupervised learning deals with unlabeled data. The objective is to discover structure and patterns in the data without predefined target values. Tasks such as cluster analysis, association rule mining, outlier detection, and dimensionality reduction can be modeled as unsupervised learning problems. Because these tasks vary widely, there is no single process outline to follow. Instead, we will walk through the process for some of the most common unsupervised learning problems.

Cluster analysis

Cluster analysis is a subset of unsupervised learning that aims to group similar items from a set of items. A real-life example is clustering movies according to attributes such as genre, length, and ratings. Cluster analysis helps us identify interesting groups of objects. These could be items we encounter in day-to-day life, such as movies or songs grouped according to taste, or users grouped by interests in terms of their demography or purchasing patterns...
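To make the movie example concrete, here is a minimal sketch in plain Python. It uses k-means, which is one common clustering algorithm and is chosen here only for illustration; the text above does not prescribe a particular method. The movie names, lengths, ratings, and the choice of two clusters are all hypothetical, and the code does not use Mahout's API.

```python
import math

def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single attribute dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return assignments, centroids

# Hypothetical data: (length in minutes, average rating out of 10).
names = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E", "Movie F"]
raw = [(95, 5.5), (100, 6.0), (90, 5.0), (170, 8.5), (180, 9.0), (165, 8.0)]

scaled = min_max_scale(raw)
# Assumption: seed the two clusters with the first and fourth movies.
labels, centers = kmeans(scaled, [scaled[0], scaled[3]])

for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

With this toy data, the shorter, lower-rated movies land in one cluster and the longer, higher-rated movies in the other. The attributes are rescaled first because length in minutes would otherwise outweigh the rating in the Euclidean distance.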