While writing a MapReduce program with rmr2 is much easier than writing a native Java version, it is still hard for non-developers. You can therefore use plyrmr, a high-level abstraction over MapReduce that lets you manipulate big data with plyr-like operations. In this recipe, we introduce some of the operations you can use to manipulate data.
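To give a sense of what a plyr-like operation looks like, here is a minimal base R sketch of the same kind of filter this recipe applies to the Titanic data (rows with Freq of at least 100). It uses only base R and the built-in dataset, not plyrmr; the variable name frequent is an illustrative assumption.

```r
# Convert the built-in Titanic contingency table to a data frame
# with the columns Class, Sex, Age, Survived, and Freq.
titanic <- data.frame(Titanic)

# Base R filter: keep only the rows whose Freq is at least 100.
frequent <- subset(titanic, Freq >= 100)

print(frequent)
```

The plyrmr operations in this recipe express the same intent, but can also run on a Hadoop backend rather than only in memory.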
Before starting this recipe, you should have completed the previous recipes by installing plyrmr and rmr2 in R.
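One way to confirm this prerequisite is sketched below. It relies only on base R's requireNamespace(); the variable names required, available, and missing_pkgs are illustrative assumptions, not part of either package.

```r
# Packages this recipe depends on (names taken from the text above).
required <- c("rmr2", "plyrmr")

# requireNamespace() returns TRUE only if the package can be loaded;
# quietly = TRUE suppresses startup messages.
available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still has to be installed before starting.
missing_pkgs <- required[!available]
if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("rmr2 and plyrmr are both available.")
}
```

If either package is reported as missing, return to the earlier installation recipes before continuing.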
Perform the following steps to manipulate data with plyrmr:
- First, you need to load both plyrmr and rmr2 into R:
> library(rmr2)
> library(plyrmr)
- You can then set the execution mode to the local mode:
> plyrmr.options(backend="local")
- Next, load the Titanic dataset into R:
> data(Titanic)
> titanic = data.frame(Titanic)
- Begin the operation by filtering the data:
> where(
+   titanic,
+   Freq >= 100)
- You can also use a pipe operator to filter the...