We will start off by introducing you to three very useful and versatile packages which facilitate out-of-memory data processing:
ff package, authored by Adler, Gläser, Nenadic, Oehlschlägel, and Zucchini, is several years old, but it still proves to be a popular solution for large-data processing in R. The title of the package, "Memory-efficient storage of large data on disk and fast access functions", roughly explains what it does. It chunks the dataset and stores it on a hard drive, while the
ff data structure (or
ffdf data frame), which is held in RAM like the other R data structures, provides a mapping to the partitioned dataset. The chunks of raw data are simply binary flat files in native encoding, whereas the
ff objects keep the metadata, which describe and link to the created binary files. Creating
ff structures and binary files from the raw data does not...
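As a small illustration of the mechanism described above, the following sketch (assuming the ff package is installed) creates a disk-backed vector and a small ffdf data frame; only the metadata objects live in RAM, while the values sit in binary flat files on disk:

```r
library(ff)

# Create an ff integer vector; the data are written to a binary
# flat file on disk, while the ff object in RAM holds metadata only.
x <- ff(1:10, vmode = "integer")

filename(x)   # path to the underlying binary flat file
x[3:5]        # values are read from disk on access

# An ffdf data frame maps several ff vectors to one table-like object.
df <- ffdf(id = ff(1:5), val = ff(seq(0.1, 0.5, by = 0.1)))
dim(df)       # 5 rows, 2 columns
```

Subsetting an ff vector, as in `x[3:5]`, returns an ordinary in-memory R vector, so downstream code can work with the chunks it pulls in without knowing about the disk-backed storage.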