Besides manipulating a dataset, one of the most useful features of dplyr is that it lets us easily obtain summary statistics from the data. In SQL, we can use the GROUP BY clause for this purpose, and dplyr supports a similar operation. In this recipe, we will show you how to summarize data with dplyr.
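To make the SQL analogy concrete, here is a minimal sketch on a small made-up table. The orders data frame and its values are hypothetical and are not part of the recipe's purchase data; only the column names User and Price mirror the ones used later in this recipe.

```r
library(dplyr)

# Hypothetical toy data (not the recipe's purchase_order.tab)
orders <- data.frame(
  User  = c("U1", "U2", "U1", "U3", "U2"),
  Price = c(100, 250, 50, 75, 25),
  stringsAsFactors = FALSE
)

# Roughly equivalent SQL:
#   SELECT User, SUM(Price) AS Total FROM orders GROUP BY User
totals <- orders %>%
  group_by(User) %>%
  summarise(Total = sum(Price))

print(totals)
```

Here group_by plays the role of GROUP BY, and summarise collapses each group into a single row.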
Ensure that you have completed the Enhancing a data.frame with a data.table recipe to load purchase_view.tab and purchase_order.tab as both data.frame and data.table into your R environment.
Perform the following steps to summarize data with dplyr:
First, use the summarize and group_by functions to obtain the total purchase amount of each user (dplyr accepts both the summarize and summarise spellings):

> order.dt %>%
+   select(User, Price) %>%
+   group_by(User) %>%
+   summarise(sum(Price)) %>%
+   head()
          User sum(Price)
1   U312622727       1868
2   U239012343       5195
3 U10007697373        494
4   U296328517       1490
5   U300884570        249
6   U451050374...
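As a variation on this step, the summary column can be given an explicit name, and several statistics can be computed in one summarise call. The following is only a sketch on a hypothetical toy table (toy_orders and its values are invented for illustration); the column names total_price, n_orders, and mean_price are likewise assumptions, not names used by the recipe.

```r
library(dplyr)

# Hypothetical toy data standing in for the order table
toy_orders <- data.frame(
  User  = c("U1", "U1", "U2", "U3", "U3", "U3"),
  Price = c(200, 300, 150, 50, 100, 450),
  stringsAsFactors = FALSE
)

user_summary <- toy_orders %>%
  group_by(User) %>%
  summarise(
    total_price = sum(Price),   # total amount per user
    n_orders    = n(),          # number of rows (orders) per user
    mean_price  = mean(Price)   # average order price per user
  ) %>%
  arrange(desc(total_price))    # largest spenders first

print(user_summary)
```

Naming the result (total_price instead of the default sum(Price)) makes the column easier to refer to in later steps such as arrange or filter.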