Learning Data Mining with Python

Chapter 4 – Recommending Movies Using Affinity Analysis


New datasets

http://www2.informatik.uni-freiburg.de/~cziegler/BX/

There are many recommendation datasets worth investigating, each with its own issues. For example, the Book-Crossing dataset contains more than 278,000 users and over a million ratings. Some of these ratings are explicit (the user actually gave a rating), while others are implicit. Implicit ratings probably shouldn't be weighted as heavily as explicit ratings.
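One way to act on that weighting idea is sketched below. It is only an illustration: the sample rows are made up, the two weight values are arbitrary, and it assumes that a rating of 0 marks an implicit interaction while any other value is an explicit rating. Check the documentation of whichever dataset you load before relying on that convention.

```python
from collections import defaultdict

# Assumed weights: explicit ratings count fully, implicit ones only partly.
# The value 0.25 is arbitrary and should be tuned for your data.
EXPLICIT_WEIGHT = 1.0
IMPLICIT_WEIGHT = 0.25

# Made-up sample rows in the form (user_id, item_id, rating).
# Assumption: rating == 0 means "implicit", anything else is explicit.
ratings = [
    ("u1", "book_a", 8),
    ("u1", "book_b", 0),
    ("u2", "book_a", 0),
    ("u2", "book_b", 9),
    ("u3", "book_a", 7),
]


def rating_weight(rating):
    """Return the weight one rating contributes to an item's count."""
    return IMPLICIT_WEIGHT if rating == 0 else EXPLICIT_WEIGHT


def weighted_support(rows):
    """Sum the weights of all interactions for each item."""
    support = defaultdict(float)
    for _user, item, rating in rows:
        support[item] += rating_weight(rating)
    return dict(support)


book_support = weighted_support(ratings)
print(book_support)
```

The weighted counts can then stand in for raw frequency counts when deciding which items are frequent enough to keep.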

The music website www.last.fm has released a great dataset for music recommendation: http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/.

There is also a joke recommendation dataset! See here: http://eigentaste.berkeley.edu/dataset/.

The Eclat algorithm

http://www.borgelt.net/eclat.html

The Apriori algorithm implemented in this chapter is easily the most famous of the association rule mining algorithms, but it isn't necessarily the best. Eclat is a more recent algorithm that can be implemented relatively easily.
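To give a feel for how little code Eclat needs, here is a minimal sketch of its core idea: store, for each item, the set of transaction IDs that contain it, and grow itemsets by intersecting those sets. The function name `eclat`, its parameters, and the sample data are chosen for illustration only. The sketch finds frequent itemsets but does not generate association rules, and it makes no attempt at the optimizations a production implementation would use.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    transactions: iterable of sets of sortable items (e.g. movie IDs).
    min_support: minimum number of transactions an itemset must appear in.
    """
    # Vertical layout: map each item to the IDs of transactions containing it.
    tidsets = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs already frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Only combine with later items so each itemset is built once.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    initial = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), initial)
    return frequent


# Made-up example: each set holds the movies one user reviewed favorably.
favorable_reviews = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 2, 3}, {3}]
frequent_itemsets = eclat(favorable_reviews, min_support=2)
for itemset, count in sorted(frequent_itemsets.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Because support is read directly from the size of each intersection, the data is scanned only once, when the transaction-ID sets are built.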