In this chapter, we discussed the split-apply-combine strategy: what it is and why it is important in data manipulation. The strategy can be implemented using base R, but doing so requires a large amount of code and is neither memory nor time efficient. To overcome this limitation, we discussed the plyr package, with which group-wise data manipulation can be implemented efficiently. The functions within plyr are intuitive, and their names indicate the input and output types. A large variety of data processing can be done using only a few functions that share a common input type and differ in their output type. For further reading, an interested user can refer to the paper The Split-Apply-Combine Strategy for Data Analysis by Wickham, which can be found at http://www.jstatsoft.org/v40/i01/paper. We also discussed how we can use dplyr as a powerful tool to manipulate data frames.
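As a brief recap, the three approaches can be compared on one small task. The following is a minimal sketch, not a definitive implementation: the choice of the built-in iris data set, the task (mean sepal length per species), and the names pieces, means, base_res, plyr_res, dplyr_res, and mean_sl are assumptions made for illustration. The plyr and dplyr calls run only if those packages are installed.

```r
# Task (assumed for illustration): mean sepal length per species in iris.

# Base R: split, apply, and combine as three explicit steps.
pieces   <- split(iris$Sepal.Length, iris$Species)  # split by group
means    <- vapply(pieces, mean, numeric(1))        # apply to each piece
base_res <- data.frame(Species = names(means),      # combine the results
                       mean_sl = unname(means))

# plyr: a single call; the "dd" in ddply stands for
# data frame input and data frame output.
if (requireNamespace("plyr", quietly = TRUE)) {
  plyr_res <- plyr::ddply(iris, "Species", plyr::summarise,
                          mean_sl = mean(Sepal.Length))
}

# dplyr: group the data frame, then summarise each group.
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr_res <- dplyr::summarise(dplyr::group_by(iris, Species),
                                mean_sl = mean(Sepal.Length))
}
```

Each approach should produce one row per species. The functions are called with explicit plyr:: and dplyr:: prefixes because both packages export a function named summarise, and attaching both at once would mask one of them.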
In the following chapter, you will learn about reshaping...