R Data Analysis Projects

By: Gopi Subramanian

Overview of this book

R offers a large variety of packages and libraries for fast and accurate data analysis and visualization. As a result, it is one of the most widely used languages among data scientists, analysts, and anyone else who needs to perform data analysis. This book demonstrates how you can use your existing knowledge of data analysis in R to build efficient, end-to-end data analysis pipelines. You'll start by building a content-based recommendation system, followed by a project on sentiment analysis with tweets. You'll implement time-series modeling for anomaly detection and work through cluster analysis of streaming data. You'll also work through projects on market data research, recommendation systems, and network analysis, all provided with easy-to-follow code. With the help of these real-world projects, you'll get a better understanding of the challenges faced when building data analysis pipelines and see how to overcome them without compromising the efficiency or accuracy of your systems. The book covers widely used R packages such as dplyr, ggplot2, and RShiny, and includes tips on using them effectively. By the end of this book, you'll have a better understanding of data analysis with R and be able to put your knowledge to practical use.

Wrapping up

The final step in any data analysis project is documentation: either generating a report of the findings or documenting the scripts and data used. In our case, we will wrap up with a small application built with RShiny, a framework for developing interactive web applications in R. We will reuse the code we have written so far to give our retail customers a simple user interface.

To keep things simple, the application has three screens. The first screen, shown in the following screenshot, lets the user vary the support and confidence thresholds and view the rules generated. It also displays additional interest measures: lift, conviction, and leverage. The user can sort the rules by any of these interest measures.
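
For reference, the interest measures shown on this screen can be computed by hand from simple transaction counts. The following is a minimal sketch in base R; the counts and the rule.measures helper are hypothetical and are not part of App.R, which obtains these values from arules:

```r
# Hypothetical counts for a rule A => B (not taken from the book's data).
n.txn  <- 100  # total number of transactions
n.lhs  <- 40   # transactions containing the antecedent A
n.rhs  <- 50   # transactions containing the consequent B
n.both <- 30   # transactions containing both A and B

rule.measures <- function(n.txn, n.lhs, n.rhs, n.both) {
  # Standard definitions of the interest measures for a rule A => B
  support    <- n.both / n.txn   # P(A and B)
  confidence <- n.both / n.lhs   # P(B | A)
  p.lhs      <- n.lhs / n.txn    # P(A)
  p.rhs      <- n.rhs / n.txn    # P(B)
  data.frame(
    support    = support,
    confidence = confidence,
    lift       = confidence / p.rhs,
    leverage   = support - p.lhs * p.rhs,
    conviction = (1 - p.rhs) / (1 - confidence)
  )
}

measures <- rule.measures(n.txn, n.lhs, n.rhs, n.both)
print(measures)
```

With these counts the rule has a lift of 1.5, a leverage of about 0.1, and a conviction of 2, so A and B occur together more often than they would if they were independent.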

Another screen is a scatter plot representation of the rules:

Finally, a graph representation to view the product grouping for the easy selection of products for the cross-selling campaign is as follows:

The complete source code is available in <../App.R>:

########################################################################
#
# R Data Analysis Projects
#
# Chapter 1
#
# Building Recommender System
# A step-by-step approach to building association rule mining
#
# Script:
#
# Rshiny app
#
# Gopi Subramanian
#########################################################################
library(shiny)
library(plotly)
library(arules)
library(igraph)
library(arulesViz)
get.txn <- function(data.path, columns){
  # Get transaction object for a given data file
  #
  # Args:
  #   data.path: data file name location
  #   columns: transaction id and item id columns.
  #
  # Returns:
  #   transaction object
  transactions.obj <- read.transactions(file = data.path, format = "single",
                                        sep = ",",
                                        cols = columns,
                                        rm.duplicates = FALSE,
                                        quote = "", skip = 0,
                                        encoding = "unknown")
  return(transactions.obj)
}
get.rules <- function(support, confidence, transactions){
  # Get Apriori rules for given support and confidence values
  #
  # Args:
  #   support: support parameter
  #   confidence: confidence parameter
  #   transactions: transaction object to mine
  #
  # Returns:
  #   rules object
  parameters = list(
    support = support,
    confidence = confidence,
    minlen = 2, # Minimal number of items per item set
    maxlen = 10, # Maximal number of items per item set
    target = "rules"
  )

  rules <- apriori(transactions, parameter = parameters)
  return(rules)
}
find.rules <- function(transactions, support, confidence, topN = 10){
  # Generate and prune the rules for given support and confidence values
  #
  # Args:
  #   transactions: Transaction object, list of transactions
  #   support: Minimum support threshold
  #   confidence: Minimum confidence threshold
  #   topN: Number of rules to keep, ranked by leverage
  # Returns:
  #   A data frame with the best set of rules and their interest measures

  # Get rules for given combination of support and confidence
  all.rules <- get.rules(support, confidence, transactions)

  rules.df <- data.frame(rules = labels(all.rules),
                         all.rules@quality)

  other.im <- interestMeasure(all.rules, transactions = transactions)

  rules.df <- cbind(rules.df, other.im[,c('conviction','leverage')])

  # Keep the topN rules with the highest leverage
  best.rules.df <- head(rules.df[order(-rules.df$leverage),], topN)

  return(best.rules.df)
}
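plot.graph <- function(cross.sell.rules){
  # Build a graph of the associated items
  #
  # Args:
  #   cross.sell.rules: Set of final rules recommended
  # Returns:
  #   An igraph graph object; the caller plots it
  edges <- unlist(lapply(cross.sell.rules['rules'], strsplit, split='=>'))
  # Trim the spaces left around '=>' so that an item set gets the same
  # node name on either side of a rule
  edges <- trimws(edges)
  g <- graph(edges = edges)
  return(g)
}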
columns <- c("order_id", "product_id") ## columns of interest in data file
data.path = '../../data/data.csv' ## Path to data file
transactions.obj <- get.txn(data.path, columns) ## create txn object
server <- function(input, output) {
  cross.sell.rules <- reactive({
    support <- input$Support
    confidence <- input$Confidence
    cross.sell.rules <- find.rules(transactions.obj, support, confidence)
    cross.sell.rules$rules <- as.character(cross.sell.rules$rules)
    return(cross.sell.rules)
  })

  gen.rules <- reactive({
    support <- input$Support
    confidence <- input$Confidence
    gen.rules <- get.rules(support, confidence, transactions.obj)
    return(gen.rules)
  })

  output$rulesTable <- DT::renderDataTable({
    cross.sell.rules()
  })

  output$graphPlot <- renderPlot({
    g <- plot.graph(cross.sell.rules())
    plot(g)
  })

  output$explorePlot <- renderPlot({
    plot(x = gen.rules(), method = NULL,
         measure = "support",
         shading = "lift", interactive = FALSE)
  })
}
ui <- fluidPage(
  headerPanel(title = "X-Sell Recommendations"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("Support", "Support threshold:", min = 0.01, max = 1.0, value = 0.01),
      sliderInput("Confidence", "Confidence threshold:", min = 0.05, max = 1.0, value = 0.05)
    ),
    mainPanel(
      tabsetPanel(
        id = 'xsell',
        tabPanel('Rules', DT::dataTableOutput('rulesTable')),
        tabPanel('Explore', plotOutput('explorePlot')),
        tabPanel('Item Groups', plotOutput('graphPlot'))
      )
    )
  )
)
shinyApp(ui = ui, server = server)

We described the get.txn, get.rules, and find.rules functions in the previous section, so we will not go through them again here. The preceding code is a single-file RShiny app: both the server and the UI components reside in the same file.
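
Stripped of the recommender logic, the single-file structure reduces to three pieces: a ui object, a server function, and a call to shinyApp that ties them together. The following is a minimal sketch of that skeleton; the threshold slider and the echo output are hypothetical names used only for illustration:

```r
library(shiny)

# UI: one hypothetical slider and one text output slot.
ui <- fluidPage(
  sliderInput("threshold", "Threshold:", min = 0, max = 1, value = 0.5),
  textOutput("echo")
)

# Server: fills the "echo" slot from the current slider value.
server <- function(input, output) {
  output$echo <- renderText({
    paste("Current threshold:", input$threshold)
  })
}

# shinyApp() only builds the app object; assigning it does not start a server.
app <- shinyApp(ui = ui, server = server)
```

In an interactive session, runApp(app) starts the application in the browser.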

The UI component is as follows:

ui <- fluidPage(
  headerPanel(title = "X-Sell Recommendations"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("Support", "Support threshold:", min = 0.01, max = 1.0, value = 0.01),
      sliderInput("Confidence", "Confidence threshold:", min = 0.05, max = 1.0, value = 0.05)
    ),
    mainPanel(
      tabsetPanel(
        id = 'xsell',
        tabPanel('Rules', DT::dataTableOutput('rulesTable')),
        tabPanel('Explore', plotOutput('explorePlot')),
        tabPanel('Item Groups', plotOutput('graphPlot'))
      )
    )
  )
)

We define the screen layout in this section. It can also be kept in a separate file called ui.R, with the server code in server.R. The page has two sections: a panel on the left, defined by sidebarPanel, and a main section, defined by mainPanel. Inside the sidebar, we have defined two slider controls, one for the support threshold and one for the confidence threshold. The main panel contains a tabbed window, defined by tabsetPanel, in which each tab is a tabPanel.

The main panel has three tabs, each holding one output slot: the final set of rules with their interest measures, a scatter plot of the rules, and the graph plot of the rules.

The server component is as follows:

server <- function(input, output) {
  cross.sell.rules <- reactive({
    support <- input$Support
    confidence <- input$Confidence
    cross.sell.rules <- find.rules(transactions.obj, support, confidence)
    cross.sell.rules$rules <- as.character(cross.sell.rules$rules)
    return(cross.sell.rules)
  })

The cross.sell.rules data frame is defined as a reactive component. Whenever the support or confidence threshold changes in the UI, the cross.sell.rules data frame is recomputed. The data frame is served to the first tab, where we have defined a slot for it called rulesTable.
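
The recompute-on-change behavior can be observed without launching the app by driving a server function with shiny's testServer. The following is a minimal sketch, not the book's code: the thresholds reactive, the eval.count counter, and the observed environment are hypothetical stand-ins for the more expensive find.rules call:

```r
library(shiny)

eval.count <- 0  # counts how often the reactive body actually runs

server <- function(input, output, session) {
  thresholds <- reactive({
    eval.count <<- eval.count + 1
    c(support = input$Support, confidence = input$Confidence)
  })
}

observed <- new.env()  # collects values seen inside the test session

testServer(server, {
  session$setInputs(Support = 0.01, Confidence = 0.05)
  thresholds()
  thresholds()  # second read is served from the cached value
  assign("after.first", eval.count, envir = observed)

  session$setInputs(Support = 0.20)  # changing an input invalidates the cache
  assign("support", thresholds()[["support"]], envir = observed)
  assign("after.change", eval.count, envir = observed)
})
```

The body of the reactive should run once for the two reads that follow the first setInputs call, and once more after the support value changes. This caching is why both the table and the graph can read cross.sell.rules() without mining the rules twice.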

  gen.rules <- reactive({
    support <- input$Support
    confidence <- input$Confidence
    gen.rules <- get.rules(support, confidence, transactions.obj)
    return(gen.rules)
  })

This reactive component reruns the rule mining and returns the rules object every time the user changes the support or confidence threshold in the UI.
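
To see what moving the sliders does to the rules object, the same apriori call can be run at two threshold settings on a toy data set. The following is a minimal sketch; the baskets and the mine helper are hypothetical and are used in place of the book's data file:

```r
library(arules)

# Hypothetical baskets, used only for illustration.
baskets <- list(
  c("bread", "butter"),
  c("bread", "butter", "milk"),
  c("bread", "milk"),
  c("butter", "milk"),
  c("bread", "butter")
)
txn <- as(baskets, "transactions")

# Same parameter list as get.rules, minus maxlen; verbose output is silenced.
mine <- function(support, confidence) {
  apriori(txn,
          parameter = list(support = support, confidence = confidence,
                           minlen = 2, target = "rules"),
          control = list(verbose = FALSE))
}

loose.rules  <- mine(support = 0.2, confidence = 0.2)
strict.rules <- mine(support = 0.6, confidence = 0.6)

length(loose.rules)   # number of rules at the low thresholds
length(strict.rules)  # never more than at the low thresholds
```

Raising either threshold can only remove rules, so the table and plots shrink as the sliders move to the right.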

  output$rulesTable <- DT::renderDataTable({
    cross.sell.rules()
  })

The preceding code renders the data frame in the rulesTable slot of the UI. The two plots are rendered in the same way:

  output$graphPlot <- renderPlot({
    g <- plot.graph(cross.sell.rules())
    plot(g)
  })

  output$explorePlot <- renderPlot({
    plot(x = gen.rules(), method = NULL,
         measure = "support",
         shading = "lift", interactive = FALSE)
  })
}

The preceding two pieces of code render the plot back to the UI.
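
The graph on the Item Groups tab is built from the rule labels: each label is split at the => separator, and igraph reads the resulting flat vector as consecutive from/to pairs. The following is a minimal sketch in base R with hypothetical rule labels. It also shows why whitespace around the separator matters: if it is not trimmed, the same item set gets a different node name on the left and on the right of a rule.

```r
# Hypothetical rule labels in the "{lhs} => {rhs}" form used by arules.
rule.labels <- c("{bread} => {butter}",
                 "{butter} => {milk}")

# Splitting on "=>" leaves a trailing or leading space on each part.
raw.nodes <- unlist(strsplit(rule.labels, split = "=>", fixed = TRUE))

# Trimming makes "{butter}" identical on both sides of the separator.
nodes <- trimws(raw.nodes)

# Consecutive pairs form the edges: one row per rule.
edge.list <- matrix(nodes, ncol = 2, byrow = TRUE,
                    dimnames = list(NULL, c("from", "to")))

length(unique(raw.nodes))  # 4: "{butter} " and " {butter}" differ
length(unique(nodes))      # 3: the two butter nodes are merged
print(edge.list)
```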
