Apache Spark Graph Processing

We are now ready to apply the previous graph clustering method to cluster songs according to the tags attached to each song. Alternatively, a dataset of song playlists can be used to cluster songs that appear together in many playlists. The datasets that we are going to work with can be downloaded from http://www.cs.cornell.edu/~shuochen/lme/data_page.html. They consist of the following files:

train.txt: Contains the playlist data, with songs represented by integer IDs
tags.txt: Contains the social tags, with songs represented by integer IDs
song_hash.txt: Maps a song ID to its title and artist
tag_hash.txt: Maps a tag ID to its name
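
As a rough illustration of how a mapping file such as song_hash.txt could be turned into a lookup table, here is a minimal sketch in plain Python, independent of Spark. The tab-separated layout (integer ID, title, artist) is an assumption, since the text above only says that the file maps a song ID to its title and artist. The function name parse_song_hash and the sample rows are hypothetical placeholders, not taken from the dataset.

```python
from typing import Dict, Iterable, Tuple


def parse_song_hash(lines: Iterable[str]) -> Dict[int, Tuple[str, str]]:
    """Build a song ID -> (title, artist) lookup.

    ASSUMPTION: each line looks like "<integer ID><TAB><title><TAB><artist>".
    Check the delimiter against the downloaded file before relying on this.
    """
    songs: Dict[int, Tuple[str, str]] = {}
    for line in lines:
        fields = [field.strip() for field in line.rstrip("\n").split("\t")]
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        song_id, title, artist = fields
        songs[int(song_id)] = (title, artist)
    return songs


# Made-up placeholder rows, NOT real entries from song_hash.txt.
sample_lines = [
    "0\tExample Title A\tExample Artist X\n",
    "1\tExample Title B\tExample Artist Y\n",
    "\n",
]

songs = parse_song_hash(sample_lines)
print(songs[1])
```

With the downloaded data, the same function could be given an open file handle instead of the sample list. The same approach would apply to tag_hash.txt, with two fields per line instead of three, if it follows a similar layout.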

Each file has its own format, as explained here:

Format of the playlist data: The first line of the data file lists the song IDs (not the integer IDs, but IDs from other sources that identify the songs), separated by spaces...
