The logic of plyr is very similar to aggregate() and apply(). In fact, the title of the package is Tools for splitting, applying, and combining data.
Its main functions are easy to understand because each one is named after its input and output objects: the first letter denotes the input and the second the output, using the following letters:
d: This is for a data frame
a: This is for an array
l: This is for a list
m: This is for a data frame or an array (column-wise, only as input)
_: This is for no output, that is, the result is discarded (only for outputs)
So, for instance, laply receives a list and returns an array, ddply receives a data frame and returns a data frame, and so on.
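The naming convention can be illustrated with a short sketch. The toy lists and the anonymous helper function below are made up for illustration; only laply, ldply, ddply, and summarize come from plyr.

```r
library(plyr)

# laply: list in, array out (here one value per list element)
lens <- laply(list(a = 1:3, b = 1:5, c = 1:2), length)

# ldply: list in, data frame out (one row per list element)
df <- ldply(list(a = 1:3, b = 1:5), function(x) {
  data.frame(n = length(x), total = sum(x))
})

# ddply: data frame in, data frame out (one row per species)
counts <- ddply(iris, .(Species), summarize, n = length(Species))
```

In each call, the first letter of the function name matches the type of the first argument, and the second letter matches the type of the returned object.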
Although the package offers a wide variety of functions, the most important ones are those that take a data frame as input (that is, the ones starting with d). The following is an example of their usage:
library(plyr)
ddply(iris, .(Species), summarize,
      indicator1 = quantile(Sepal.Length, 0.75),
      indicator2 = sum(Sepal.Width) / sum(Petal.Length))
##   Species indicator1 indicator2
## 1 setosa ...
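To make the split, apply, and combine steps behind this call explicit, the following is a rough base R sketch of the same computation. It is an illustration of the idea, not a description of how plyr is implemented internally; the variable names pieces, results, manual, and auto are arbitrary.

```r
library(plyr)

# Split: one data frame per species
pieces <- split(iris, iris$Species)

# Apply: compute the same two indicators on each piece
results <- lapply(pieces, function(d) {
  data.frame(
    indicator1 = quantile(d$Sepal.Length, 0.75),
    indicator2 = sum(d$Sepal.Width) / sum(d$Petal.Length)
  )
})

# Combine: stack the per-species rows back into one data frame
manual <- do.call(rbind, results)

# The same computation in a single ddply call
auto <- ddply(iris, .(Species), summarize,
              indicator1 = quantile(Sepal.Length, 0.75),
              indicator2 = sum(Sepal.Width) / sum(Petal.Length))
```

Both objects should hold the same indicator values per species; the manual version keeps the species in the row names, whereas ddply returns them as a regular Species column.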