Learning Apache Mahout

Chapter 5. Frequent Pattern Mining and Topic Modeling

In this chapter, we discuss two important application areas of machine learning: frequent pattern mining and topic modeling. Frequent pattern mining identifies sets of items that occur together frequently across transactions. The technique is widely used in market basket analysis, upselling and cross-selling of products, and so on. There are many algorithms for mining frequent patterns from databases, such as Apriori, Tree Projection, and FP-Growth; we will restrict our discussion to FP-Growth, which is implemented in Mahout. Topic modeling represents the documents under consideration as mixtures of topics. Each topic is a bag of words, and we can use its most prominent words to label it. For topic modeling, we will discuss the Mahout implementation of Latent Dirichlet Allocation (LDA). The topics covered in this chapter are as follows:

  • Frequent pattern mining

  • Topic modeling
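
To make the idea of a frequent pattern concrete before the detailed discussion, the following minimal Python sketch counts how often pairs of items occur together in a handful of transactions and keeps the pairs that meet a minimum support count. It is a brute-force illustration only: it is not FP-Growth and not Mahout's implementation. The function name `frequent_pairs`, the `min_support` parameter, and the toy baskets are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support transactions.

    Brute-force counting for illustration; this is NOT the FP-Growth algorithm.
    """
    counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair has a single canonical ordering.
        items = sorted(set(basket))
        counts.update(combinations(items, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy baskets, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["eggs", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]

pairs = frequent_pairs(transactions, min_support=3)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

With a support threshold of 3, only the pairs (bread, butter) and (bread, milk) survive, since each appears in three of the five baskets. Enumerating every candidate pair like this becomes expensive as the number of items and the pattern length grow, which is the cost that algorithms such as FP-Growth are designed to avoid.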