R High Performance Programming

Using memory-mapped files and processing data in chunks


Some datasets are so large that, even after applying every memory optimization technique and using the most efficient data types possible, they still cannot fit in memory or be processed there. Short of adding RAM, one way to work with such data is to store it on disk in the form of memory-mapped files and load it into memory for processing one small chunk at a time.

For example, suppose a dataset would require 100 GB of RAM if fully loaded into memory, plus another 100 GB of free memory for the computations to be performed on it. If the computer that is to process the data has only 64 GB of RAM, we might divide the data into four chunks of 25 GB each. Assuming the working memory scales with the chunk size, each chunk then needs about 50 GB in total, which fits within the available RAM. The R program then loads the data into memory one chunk at a time and performs the necessary computations on each chunk. After all the chunks have been processed, the results from each chunk-wise...
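
The chunk-wise pattern described above can be sketched in base R. This is only a minimal illustration under stated assumptions: the helper `chunked_mean()`, its `chunk_rows` parameter, and the small temporary CSV file are hypothetical names invented for the example, and the sketch reads chunks sequentially from an ordinary file connection rather than from a true memory-mapped file. The point it shows is that only one chunk and a few running totals are held in memory at any time.

```r
# Hypothetical helper (not from the book): compute the mean of one numeric
# column by streaming a CSV file through memory in fixed-size chunks.
chunked_mean <- function(path, column, chunk_rows = 1000L) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header line once so every chunk can reuse the column names.
  header <- scan(text = readLines(con, n = 1L), what = character(),
                 sep = ",", quiet = TRUE)

  total <- 0
  count <- 0
  repeat {
    # Pull at most chunk_rows lines, continuing where the last read stopped.
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0L) break

    chunk <- read.csv(text = lines, header = FALSE, col.names = header)

    # Keep only the small per-chunk summaries, not the chunk itself.
    total <- total + sum(chunk[[column]])
    count <- count + nrow(chunk)
  }

  # Combine the per-chunk summaries into the final result.
  total / count
}

# A small temporary file stands in for a dataset too large for memory.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = as.numeric(1:10)), tmp,
          row.names = FALSE)
result <- chunked_mean(tmp, "value", chunk_rows = 3L)
```

Here the ten rows are processed as chunks of three, three, three, and one row, and `result` equals the mean of the full `value` column. A mean combines cleanly because it can be rebuilt from per-chunk sums and counts; statistics that cannot be decomposed this way, such as a median, would need a different strategy.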