Hands-On Recommendation Systems with Python

By: Rounak Banik

Overview of this book

Recommendation systems are at the heart of almost every internet business today, from Facebook to Netflix to Amazon. Providing good recommendations, whether it's friends, movies, or groceries, goes a long way in defining user experience and enticing your customers to use your platform. This book shows you how to do just that. You will learn about the different kinds of recommenders used in the industry and see how to build them from scratch using Python. There is no need to wade through tons of machine learning theory; you'll get started with building and learning about recommenders as quickly as possible. In this book, you will build an IMDB Top 250 clone and a content-based engine that works on movie metadata. You'll use collaborative filters to make use of customer behavior data, and build a Hybrid Recommender that incorporates content-based and collaborative filtering techniques. With this book, all you need to get started with building recommendation systems is a familiarity with Python. By the time you're finished, you will have a great grasp of how recommenders work and be in a strong position to apply the techniques that you will learn to your own problem domains.

Preface

Who this book is for

What this book covers

To get the most out of this book

Getting Started with Recommender Systems

Technical requirements

What is a recommender system?

Types of recommender systems

Manipulating Data with the Pandas Library

Technical requirements

Setting up the environment

The Pandas library

The Pandas DataFrame

The Pandas Series

Building an IMDB Top 250 Clone with Pandas

Technical requirements

The simple recommender

The knowledge-based recommender

Building Content-Based Recommenders

Technical requirements

Exporting the clean DataFrame

Document vectors

The cosine similarity score

Plot description-based recommender

Metadata-based recommender

Suggestions for improvements

Getting Started with Data Mining Techniques

Problem statement

Similarity measures

Dimensionality reduction

Supervised learning

Evaluation metrics

Building Collaborative Filters

Technical requirements

User-based collaborative filtering

Item-based collaborative filtering

Model-based approaches

Hybrid Recommenders

Technical requirements

Case study – Building a hybrid model

Other Books You May Enjoy

Leave a review - let other readers know what you think

What is a recommender system?

Recommender systems are largely self-explanatory: as the name suggests, they are systems or techniques that recommend or suggest a particular product, service, or entity. They can, however, be classified into the following two categories, based on their approach to providing recommendations.

The prediction problem

In this version of the problem, we are given a matrix of m users and n items. Each row of the matrix represents a user and each column represents an item. The value of the cell in the i-th row and the j-th column denotes the rating given by user i to item j. This value is usually denoted as rᵢⱼ.

For instance, consider the matrix in the following screenshot:

This matrix has seven users rating six items. Therefore, m = 7 and n = 6. User 1 has given item 1 a rating of 4. Therefore, r₁₁ = 4.
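Since the screenshot is not reproduced here, the following is a minimal sketch of how such a matrix could be represented in plain Python. Only the shape (7 × 6) and the value r₁₁ = 4 come from the text; every other rating, and the helper name `r`, is invented for illustration.

```python
# Hypothetical 7 x 6 rating matrix: rows are users, columns are items.
# None marks an item the user has not rated. Only r_11 = 4 comes from the
# text; every other value is made up for illustration.
ratings = [
    [4,    None, 3,    None, 5,    None],
    [None, 2,    None, 4,    None, 1],
    [3,    None, None, 5,    None, None],
    [None, 5,    4,    None, None, 2],
    [1,    None, None, None, 3,    None],
    [None, None, 2,    4,    None, 5],
    [5,    3,    None, None, 2,    None],
]

m = len(ratings)     # number of users (rows)
n = len(ratings[0])  # number of items (columns)


def r(i, j):
    """Return r_ij using the text's 1-based user and item numbers."""
    return ratings[i - 1][j - 1]


print(m, n, r(1, 1))  # prints: 7 6 4
```

The helper converts the 1-based indices used in the text to Python's 0-based list indices, so `r(1, 1)` reads the rating user 1 gave to item 1.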

Let us now consider a more concrete example. Imagine you are Netflix and you have a repository of 20,000 movies and 5,000 users. You have a system in place that records every rating that each user gives to a particular movie. In other words, you have the rating matrix (of shape 5,000 × 20,000) with you.

However, all your users will have seen only a fraction of the movies you have available on your site; therefore, the matrix you have is sparse. In other words, most of the entries in your matrix are empty, as most users have not rated most of your movies.
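To make the idea of sparsity concrete, here is a small sketch that measures the fraction of empty cells in a rating matrix. The function name `sparsity`, the toy matrix, and the figure of 100 ratings per user are all assumptions chosen for illustration; they are not data from the text.

```python
def sparsity(ratings):
    """Return the fraction of cells that hold no rating (None)."""
    total = sum(len(row) for row in ratings)
    missing = sum(1 for row in ratings for value in row if value is None)
    return missing / total


# Hypothetical toy matrix: 3 users, 4 items, 3 recorded ratings.
small = [
    [5,    None, None, None],
    [None, None, 3,    None],
    [None, 4,    None, None],
]
print(sparsity(small))  # 9 of 12 cells are empty, so this prints 0.75

# At the scale in the text (5,000 users, 20,000 movies), suppose each user
# rated 100 movies. That number is an assumption, not a figure from the text.
n_users, n_movies = 5_000, 20_000
assumed_ratings_per_user = 100
density = (n_users * assumed_ratings_per_user) / (n_users * n_movies)
print(density)  # fraction of cells that would be filled under this assumption
```

Under that assumption only half a percent of the 100 million cells would hold a rating, which is why such matrices are usually stored in a sparse format rather than as a full grid.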

The prediction problem, therefore, aims to predict these missing values using all the information it has at its disposal (the ratings recorded, data on movies, data on users, and so on). If it is able to predict the missing values accurately, it will be able to give great recommendations. For example, if user i has not used item j, but our system predicts a very high rating (denoted by r̂ᵢⱼ), it is highly likely that i will love j should they discover it through the system.
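As a rough illustration of what "predicting the missing values" means, the sketch below fills each empty cell with the mean of the ratings that item has already received. This is a deliberately naive baseline chosen for brevity, not the method the book develops; the function name `item_mean_predictions` and the toy ratings are hypothetical.

```python
def item_mean_predictions(ratings):
    """Fill each missing rating with the mean observed rating of that item.

    Items with no ratings at all fall back to the global mean. Assumes the
    matrix contains at least one recorded rating.
    """
    observed = [v for row in ratings for v in row if v is not None]
    global_mean = sum(observed) / len(observed)

    n_items = len(ratings[0])
    item_means = []
    for j in range(n_items):
        column = [row[j] for row in ratings if row[j] is not None]
        item_means.append(sum(column) / len(column) if column else global_mean)

    # Keep recorded ratings as they are; replace only the missing ones.
    return [
        [v if v is not None else item_means[j] for j, v in enumerate(row)]
        for row in ratings
    ]


# Hypothetical toy matrix: 3 users, 3 items, None marks a missing rating.
toy_ratings = [
    [4,    None, 5],
    [None, 2,    4],
    [5,    1,    None],
]
predicted = item_mean_predictions(toy_ratings)
print(predicted[0][1])  # mean of item 2's ratings: (2 + 1) / 2 = 1.5
```

A real system would use far more information (similar users, item metadata, and so on), but the input and output have the same shape: a partly empty matrix goes in, and a fully populated one comes out.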

The ranking problem

Ranking is the more intuitive formulation of the recommendation problem. Given a set of n items, the ranking problem tries to discern the top k items to recommend to a particular user, utilizing all of the information at its disposal.

As in the preceding example, imagine you are a company, this time Airbnb. Your user has entered the specific things they are looking for in their host and the space (such as location and budget). You want to display the top 10 results that satisfy those conditions. This would be an example of the ranking problem.
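The Airbnb scenario can be sketched as a filter followed by a sort. Everything in this example is hypothetical: the listing fields (`city`, `price`, `score`), the sample data, and the choice to order results by a single score are assumptions made for illustration, not a description of how Airbnb ranks results.

```python
def top_k_listings(listings, city, max_price, k=10):
    """Return the k best-scoring listings that satisfy the user's conditions."""
    matches = [
        listing
        for listing in listings
        if listing["city"] == city and listing["price"] <= max_price
    ]
    # Highest score first, then keep only the first k results.
    return sorted(matches, key=lambda listing: listing["score"], reverse=True)[:k]


# Hypothetical listings, invented for this sketch.
listings = [
    {"name": "Loft",   "city": "Lisbon", "price": 90,  "score": 4.8},
    {"name": "Studio", "city": "Lisbon", "price": 60,  "score": 4.5},
    {"name": "Villa",  "city": "Lisbon", "price": 250, "score": 4.9},
    {"name": "Cabin",  "city": "Oslo",   "price": 80,  "score": 4.7},
    {"name": "Flat",   "city": "Lisbon", "price": 75,  "score": 4.6},
]

results = top_k_listings(listings, city="Lisbon", max_price=100, k=2)
print([listing["name"] for listing in results])  # prints: ['Loft', 'Flat']
```

The villa has the highest score but exceeds the budget, and the cabin is in the wrong city, so neither appears in the results.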

It is easy to see that the prediction problem often boils down to the ranking problem. If we are able to predict missing values, we can extract the top values and display them as our results.
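The step from predictions to a ranking can be sketched in a few lines: take one user's row of predicted ratings, ignore the items they have already rated, and keep the k highest. The function name `recommend_top_k` and the sample values are hypothetical.

```python
import heapq


def recommend_top_k(observed_row, predicted_row, k):
    """Return the indices of the k unrated items with the highest predictions."""
    # Only items the user has not rated yet are candidates for recommendation.
    candidates = [j for j, value in enumerate(observed_row) if value is None]
    return heapq.nlargest(k, candidates, key=lambda j: predicted_row[j])


# Hypothetical data for one user: recorded ratings (None means not rated)
# and the corresponding row of predicted ratings.
observed = [4, None, None, 5, None]
predicted_row = [4, 3.2, 4.7, 5, 4.1]

print(recommend_top_k(observed, predicted_row, k=2))  # prints: [2, 4]
```

Items 0 and 3 are skipped because the user has already rated them, even though item 3 has the highest value in the row.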

In this book, we will look at both formulations and build systems that effectively solve them.