Learning Apache Mahout

Frequent pattern mining

FP-Growth represents the frequent transactions in a consolidated data structure called the FP Tree, and the frequent patterns are then mined from that tree.

There are two major steps in mining frequent patterns with the FP-Growth algorithm: building the FP Tree and deriving frequent patterns from the FP Tree.

Building the FP Tree

Let's assume a database with the following information. For each transaction, we have a list of items that were sold.

Transaction ID	Items
1	Fish, Milk, Egg, Bread, and Biscuit
2	Lemon, Fish, Bread, and Tea
3	Fish and Milk
4	Egg and Tea
5	Fish, Biscuit, Bread, and Cup

Let the minimum support be 2. If you do not recall what support means, please revisit the section Frequent pattern mining in Chapter 2, Core Concepts in Machine Learning. We first compute the frequency of occurrence of each item in the transaction table.
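The counting step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Mahout's implementation or API: the names transactions, MIN_SUPPORT, item_frequencies, and frequent_items are hypothetical, and the data is copied from the transaction table above. The sketch also assumes that an item is counted at most once per transaction and that items falling below the minimum support are set aside.

```python
from collections import Counter

# Transactions copied from the table above, keyed by transaction ID.
transactions = {
    1: ["Fish", "Milk", "Egg", "Bread", "Biscuit"],
    2: ["Lemon", "Fish", "Bread", "Tea"],
    3: ["Fish", "Milk"],
    4: ["Egg", "Tea"],
    5: ["Fish", "Biscuit", "Bread", "Cup"],
}

MIN_SUPPORT = 2  # minimum support stated in the text


def item_frequencies(transactions):
    """Count the number of transactions in which each item appears."""
    counts = Counter()
    for items in transactions.values():
        # set() ensures an item is counted once per transaction (assumption).
        counts.update(set(items))
    return counts


def frequent_items(counts, min_support):
    """Keep only the items whose frequency is at least min_support."""
    return {item: n for item, n in counts.items() if n >= min_support}


counts = item_frequencies(transactions)
frequent = frequent_items(counts, MIN_SUPPORT)

# Print the surviving items, most frequent first, ties in alphabetical order.
for item, n in sorted(frequent.items(), key=lambda kv: (-kv[1], kv[0])):
    print(item, n)
```

With a minimum support of 2, Lemon and Cup each occur in only one transaction, so they do not appear in the filtered result.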

The frequency of occurrence of items is as shown here:

Items	Frequency
Fish	4
Milk	2
Egg	2
Bread ...