Understanding recommender systems
Recommender systems, or recommendation engines, are a popular class of machine learning algorithms widely used today by online retail companies. Given historical data about users and their product interactions, a recommender system can make useful, and often profitable, recommendations that match users to the products they are likely to prefer.
In the last decade, recommender systems have achieved great success with both online retailers and brick-and-mortar stores. They have revolutionized marketing campaigns by allowing retailers to move away from group campaigns, in which a whole group of people receives the same offer. Today, retailers can offer customized recommendations to each of their customers, and such recommendations can dramatically increase customer stickiness.
Retailers design and run sales campaigns to promote up-selling and cross-selling. Up-selling is a technique by which retailers try to persuade customers to buy higher-value products. Cross-selling is the practice of selling additional products to customers. Recommender systems provide an empirical method to generate recommendations for retailers' up-selling and cross-selling campaigns.
Retailers can now make quantitative decisions based on solid statistics and math to improve their businesses. There is a growing number of conferences and journals dedicated to recommender systems technology, which plays a vital role today at successful companies such as Amazon.com, YouTube, Netflix, LinkedIn, Facebook, TripAdvisor, and IMDb.
Depending on the type and volume of available data, recommender systems of varying complexity and accuracy can be built. Earlier, we described historical data as users and their product interactions. Let's use this definition to illustrate the different types of data in the context of recommender systems.
Transactions
Transactions are purchases made by a customer on a single visit to a retail store. Typically, transaction data includes the products purchased, the quantity purchased, the price, any discount applied, and a timestamp. A single transaction can include multiple products. In some cases, it also records information about the user who made the transaction, for example when the customer allows the retailer to store their information by joining a rewards program.
A simplified view of the transaction data is a binary matrix. Each row of this matrix corresponds to the unique identifier of a transaction; let's call it the transaction ID. Each column corresponds to the unique identifier of a product; let's call it the product ID. A cell value is one if the product is included in the transaction and zero if it is not.
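As a rough illustration of the fields described above, the following R sketch stores a few transactions in a data frame with one row per product line. The column names (txn_id, product_id, customer_id, and so on) and all values are invented for this example; they are assumptions, not the layout of any particular retailer's data.

```r
# Hypothetical transaction records: one row per product in a transaction.
# All column names and values are invented for illustration.
transactions <- data.frame(
  txn_id      = c("T1", "T1", "T2", "T2", "T2"),
  product_id  = c("P2", "P3", "P1", "P2", "P3"),
  quantity    = c(1, 2, 1, 1, 3),
  price       = c(4.50, 2.00, 10.00, 4.50, 2.00),
  discount    = c(0, 0, 1.00, 0, 0.50),
  timestamp   = as.POSIXct(c("2017-01-05 10:15:00", "2017-01-05 10:15:00",
                             "2017-01-05 11:40:00", "2017-01-05 11:40:00",
                             "2017-01-05 11:40:00"), tz = "UTC"),
  # NA marks a customer who is not enrolled in the rewards program.
  customer_id = c(NA, NA, "C042", "C042", "C042"),
  stringsAsFactors = FALSE
)

# Group product IDs by transaction: a single transaction can hold several products.
baskets <- split(transactions$product_id, transactions$txn_id)
print(baskets)
```

Here, transaction T1 has no customer attached, while T2 belongs to a hypothetical rewards-program member.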
A binary matrix with n transactions and m products is given as follows:
Txn/Product | P1 | P2 | P3 | ... | Pm |
T1 | 0 | 1 | 1 | ... | 0 |
T2 | 1 | 1 | 1 | ... | 1 |
... | ... | ... | ... | ... | ... |
Tn | 0 | 1 | 1 | ... | 1 |
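A binary matrix like this can be sketched in base R. The example below is a minimal illustration that assumes the transactions are already available as a named list of product IDs; the names baskets, products, and binary_matrix, along with the three sample transactions, are hypothetical.

```r
# Hypothetical baskets: each element lists the product IDs in one transaction.
baskets <- list(
  T1 = c("P2", "P3"),
  T2 = c("P1", "P2", "P3", "P4"),
  T3 = c("P2", "P3", "P4")
)

# The unique product IDs become the columns of the matrix.
products <- sort(unique(unlist(baskets)))

# One row per transaction: 1 if the product is in the basket, 0 otherwise.
binary_matrix <- t(sapply(baskets, function(items) as.integer(products %in% items)))
colnames(binary_matrix) <- products

print(binary_matrix)
```

Each row is a transaction ID and each column a product ID, so binary_matrix["T1", "P2"] answers whether product P2 was part of transaction T1.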
Weighted transactions
A weighted transaction carries additional information that denotes its importance, such as the profitability of the transaction as a whole or the profitability of the individual products in it. In the case of the preceding binary matrix, a column called weight is added to store the importance of each transaction.
In this chapter, we will show you how to use transaction data to support cross-selling campaigns. We will see how user-product preferences, or recommendations, derived from the users' product interactions (transactions or weighted transactions) can fuel successful cross-selling campaigns. We will implement the algorithms that leverage this data in R and learn how they work. We will work on a simplified use case in which we need to generate recommendations to support a cross-selling campaign for an imaginary retailer.
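The weight column can be sketched in base R as follows. This is only an illustration: the matrix, the weight values (imagined here as profit per transaction), and the names weighted and weighted_totals are assumptions made for the example.

```r
# Binary matrix for three hypothetical transactions (rows) and four products (columns).
binary_matrix <- matrix(
  c(0, 1, 1, 0,
    1, 1, 1, 1,
    0, 1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("T1", "T2", "T3"), c("P1", "P2", "P3", "P4"))
)

# Invented weights, for example the profit earned on each transaction.
weighted <- data.frame(binary_matrix, weight = c(12.5, 40.0, 8.75))
print(weighted)

# One possible use of the weights: for each product, sum the weights of the
# transactions that contain it (row i of the matrix is scaled by weight i).
weighted_totals <- colSums(binary_matrix * weighted$weight)
print(weighted_totals)
```

With these made-up numbers, a product that appears in a few high-weight transactions can outrank one that appears more often in low-weight transactions.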
Our web application
Our goal, by the end of this chapter, is to understand the concepts of association rule mining and related topics, and to solve the given cross-selling campaign problem using association rule mining. We will see how different aspects of the cross-selling campaign problem can be solved using the family of association rule mining algorithms, learn how to implement them in R, and finally build a web application to display our analysis and results.
We will follow a code-first approach throughout this book. We first introduce a real-world problem and then briefly introduce the algorithm or technique that can be used to solve it, keeping the description short. We then introduce the R package that implements the algorithm and start writing the R code to prepare the data in the form that the algorithm expects. As we invoke the actual algorithm in R and explore the results, we will get into the nitty-gritty of the algorithm. Finally, we will provide further references for curious readers.