Building a Recommendation System with R

Preparing the data

Starting from raw data, this section will show you how to prepare the input for the recommendation models.

Description of the data

The data is about Microsoft users visiting a website during one week. For each user, the data records which areas of the website that user visited. For simplicity, from now on we will refer to the website areas as "items".

There are 5,000 users, represented by sequential numbers between 10,001 and 15,000. Items are represented by numbers between 1,000 and 1,297, although there are fewer than 298 items, so not every number in that range is used.

The dataset is an unstructured text file. Each record contains between two and six fields. The first field is a letter that defines what the record contains. There are three main types of records:

  • Attribute (A): This is the description of a website area

  • Case (C): This is the record for each user, containing the user's ID

  • Vote (V): These are the vote lines for the case
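
As a rough illustration of the record types listed above, the following sketch splits a few records into fields and groups them by their first letter. The sample lines are invented for this example, and the layout of the fields after the first letter (comma-separated, with the item ID in the second field of a vote and the user ID in the third field of a case) is an assumption rather than something stated in this section. The sketch also assumes that each vote belongs to the most recent case record above it.

```r
# Hypothetical sample records, invented for illustration only.
# The field layout after the first letter is an assumption.
raw_lines <- c(
  'A,1001,1,"Support","/support"',
  'A,1002,1,"Downloads","/downloads"',
  'C,"10001",10001',
  'V,1001,1',
  'V,1002,1',
  'C,"10002",10002',
  'V,1002,1'
)

# Split each record into its comma-separated fields.
# Note: this simple split would break on quoted fields that contain commas.
fields <- strsplit(raw_lines, ",", fixed = TRUE)

# The first field is the letter that identifies the record type.
record_type <- vapply(fields, function(x) x[1], character(1))

# Count the records of each type and the number of fields per record.
type_counts <- table(record_type)
n_fields <- lengths(fields)

# Assumption: a vote belongs to the most recent case record above it,
# so a running count of "C" records tells us which case each line follows.
is_case <- record_type == "C"
is_vote <- record_type == "V"
case_index <- cumsum(is_case)

# Assumed positions: user ID in field 3 of a case, item ID in field 2 of a vote.
case_ids <- vapply(fields[is_case], function(x) x[3], character(1))
votes <- data.frame(
  user = case_ids[case_index[is_vote]],
  item = vapply(fields[is_vote], function(x) x[2], character(1)),
  stringsAsFactors = FALSE
)

print(type_counts)
print(votes)
```

With the sample lines above, the three vote records are attributed to users 10001, 10001, and 10002. Using a running count of case records avoids an explicit loop, but it relies entirely on the ordering assumption, so it is worth checking against the real file before trusting the result.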

Each case record is followed by one or more votes, and...