The rmr2 package allows you to perform big data processing and analysis via MapReduce on a Hadoop cluster. To run MapReduce jobs on the cluster, you have to install both R and rmr2 on every task node. In this recipe, we will illustrate how to install rmr2 on a single node of a Hadoop cluster.
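Because every task node needs both R and rmr2, it can be useful to check a node before and after installation. The following is a minimal sketch rather than part of the original recipe: the function name `check_rmr2_node` is hypothetical, and it assumes that `Rscript` is on the `PATH` whenever R is installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the recipe): report whether this node
# has R and the rmr2 package. Prints one status line and returns 0
# only when both are present.
check_rmr2_node() {
    # Rscript ships with R; if it is absent, R is missing or not on PATH.
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "R not found on this node"
        return 1
    fi
    # requireNamespace() returns FALSE instead of raising an error when
    # the package is missing, so it can drive the exit status directly.
    if Rscript -e 'quit(status = if (requireNamespace("rmr2", quietly = TRUE)) 0 else 1)' >/dev/null 2>&1; then
        echo "rmr2 is installed"
        return 0
    fi
    echo "R found, but rmr2 is not installed"
    return 1
}

# Run the check on the current node and flag it if it is not ready.
if ! check_rmr2_node; then
    echo "This node is not ready for rmr2 jobs" >&2
fi
```

Running the script on each task node gives a quick indication of which nodes still need R or rmr2 installed.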
Ensure that you have completed the previous recipe by starting the Cloudera QuickStart VM and connecting it to the internet, so that you can download and install the rmr2 package.
Perform the following steps to install rmr2 on the QuickStart VM:
- First, open the terminal within the Cloudera QuickStart VM.
- Start an R session with root privileges:
$ sudo R
- Install the packages that rmr2 depends on before installing rmr2 itself:
> install.packages(c("codetools", "Rcpp", "RJSONIO", "bitops",
"digest", "functional", "stringr", "plyr", "reshape2", "rJava",
"caTools"))
- Quit the R session:
> q()
- Next, you can download rmr-3.3.0...