Learning Data Mining with Python

The Apriori implementation


The goal of this chapter is to produce rules of the following form: if a person recommends these movies, they will also recommend this movie. We will also discuss extensions in which a person who recommends a set of movies is likely to recommend another particular movie.
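To make the rule form concrete, here is a minimal sketch of one way such a rule could be represented and checked against a single person's recommendations. The movie IDs, the variable names, and the helper functions `rule_applies` and `rule_is_correct` are hypothetical and exist only for illustration; they are not taken from the chapter's code.

```python
# Hypothetical movie IDs, chosen only for illustration.
premise = frozenset({1, 50})  # movies the person must have recommended
conclusion = 181              # movie the rule says they will also recommend


def rule_applies(favorable_movies, premise):
    """Return True if the person recommended every movie in the premise."""
    return premise.issubset(favorable_movies)


def rule_is_correct(favorable_movies, premise, conclusion):
    """Return True if the rule applies and the conclusion was also recommended."""
    return rule_applies(favorable_movies, premise) and conclusion in favorable_movies


# A made-up person who recommended four movies.
user_favorables = {1, 50, 181, 258}

applies = rule_applies(user_favorables, premise)
correct = rule_is_correct(user_favorables, premise, conclusion)
print(applies, correct)
```

Using a `frozenset` for the premise keeps it hashable, so rules of this shape could later serve as dictionary keys when counting how often each one holds.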

To do this, we first need to determine whether a person recommends a movie. We can do this by creating a new feature, Favorable, which is True if the person gave the movie a favorable review, that is, a rating above 3:

all_ratings["Favorable"] = all_ratings["Rating"] > 3

We can see the new feature by viewing the dataset:

all_ratings[10:15]
 

    UserID  MovieID  Rating             Datetime  Favorable
10      62      257       2  1997-11-12 22:07:14      False
11     286     1014       5  1997-11-17 15:38:45       True
12     200      222       5  1997-10-05 09:05:40       True
13     210       40       3  1998-03-27 21:59:54      False
14     224       29       3  1998-02-21 23:40:57      False
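As a quick check of the threshold, the following sketch rebuilds only the five rows shown above in a small DataFrame and applies the same comparison. Building the frame by hand is an assumption made to keep the example self-contained; in the chapter, `all_ratings` is loaded from the full ratings file. The name `favorable_flags` is introduced here for illustration.

```python
import pandas as pd

# The five rows shown above (index 10 to 14), entered by hand for illustration.
all_ratings = pd.DataFrame(
    {
        "UserID": [62, 286, 200, 210, 224],
        "MovieID": [257, 1014, 222, 40, 29],
        "Rating": [2, 5, 5, 3, 3],
        "Datetime": pd.to_datetime(
            [
                "1997-11-12 22:07:14",
                "1997-11-17 15:38:45",
                "1997-10-05 09:05:40",
                "1998-03-27 21:59:54",
                "1998-02-21 23:40:57",
            ]
        ),
    },
    index=range(10, 15),
)

# Same rule as in the text: a rating strictly greater than 3 is favorable.
all_ratings["Favorable"] = all_ratings["Rating"] > 3

favorable_flags = all_ratings["Favorable"].tolist()
print(all_ratings)
```

Because the comparison is strict, a rating of exactly 3 is not counted as favorable, which matches the last two rows of the table.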

We will sample our dataset to form a training dataset. This also helps reduce the size of the dataset that will be searched...