Those unfamiliar with how Hadoop works often see it as a cure-all for big data processing; some even believe that Hadoop can return processed results for data of any size within a few milliseconds. In this recipe, we will compare the performance of an R MapReduce program with that of a standard R program to demonstrate that Hadoop does not perform as quickly as some may believe.
Before starting this recipe, you should have completed the previous recipe by installing rmr2 into the R environment.
Perform the following steps to compare the performance of a standard R program and an R MapReduce program:
- First, you can implement a standard R program to square all of the numbers:
> a.time = proc.time()
> small.ints2 = 1:100000
> result.normal = sapply(small.ints2, function(x) x^2)
> proc.time() - a.time
- To compare the performance, you can implement the same squaring operation as an R MapReduce program and time it in the same way:
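A minimal sketch of that MapReduce counterpart is shown below, assuming rmr2 is loaded and a Hadoop backend (or the rmr2 local backend) is configured; the variable names `b.time`, `small.ints`, and `result` are illustrative choices mirroring the standard version, not names fixed by the library:

```r
> library(rmr2)
# Write the input vector to HDFS (or the local backend's temporary storage)
> b.time = proc.time()
> small.ints = to.dfs(1:100000)
# Map each value v to the pair (v, v^2); no reduce step is needed
> result = mapreduce(input = small.ints,
+                    map = function(k, v) cbind(v, v^2))
> proc.time() - b.time
```

Because `to.dfs` and `mapreduce` must serialize data and launch Hadoop jobs, the elapsed time here is typically far longer than the `sapply` version for a dataset this small, which is the point of the comparison.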