In the first chapter, we introduced a number of general terms and concepts related to Big Data. In Chapter 2, Introduction to R Programming Language and Statistical Environment, we presented several frequently used methods for data management, processing, and analysis using the R language and its statistical environment. In this chapter, we will merge both topics and explain how you can apply R's powerful mathematical and data modeling packages to large datasets without the need for distributed computing. After reading this chapter, you should be able to:
Understand R's traditional limitations for Big Data analytics and how they can be resolved
Use R packages such as bigmemory to enhance out-of-memory performance
Apply statistical methods to large R objects through the
Enhance the speed of data processing with R libraries supporting parallel computing
Benefit from faster data...