In this book, we covered many different aspects of parallelism, including R's built-in multicore capabilities with its parallel package, message passing using the MPI standard, and General Purpose GPU (GPGPU) parallelism with OpenCL. We also explored different framework approaches to parallelism, from load balancing through task farming to spatial processing with a grid layout, as well as more general-purpose batch data processing in the cloud using Hadoop through the segue package, and the hot new tech in cluster computing, Apache Spark, which is much better suited to real-time data processing at scale.
You should now have a broad understanding of these different approaches to parallelism, their suitability for different types of workload, how to deal with both balanced and unbalanced workloads to ensure maximum efficiency, and how to use the technologies that underpin them from R to exploit multiple cores on your PC/GPU using SPMD and SIMD vector processing...