R High Performance Programming

Optimizing parallel performance


Throughout the examples in this chapter, we saw various factors that affect the performance of parallel code.

One overhead in running parallel R code is setting up the cluster. By default, makeCluster() instructs the worker processes to load the methods package when they start. This can take a noticeable amount of time, so if the task to be run does not require methods (for example, it uses no S4 classes), this behavior can be disabled by passing methods=FALSE to makeCluster().
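The effect of this option can be checked with a small timing comparison. The following is a minimal sketch, assuming a socket cluster from the parallel package; the worker count of two and the squaring task are arbitrary choices for illustration, and the measured times will vary from machine to machine.

```r
library(parallel)

# Time cluster start-up with the default settings (methods is loaded
# on every worker). Two workers is an arbitrary, illustrative choice.
t_default <- system.time(cl1 <- makeCluster(2))
stopCluster(cl1)

# Time cluster start-up again, this time skipping the methods package.
t_no_methods <- system.time(cl2 <- makeCluster(2, methods = FALSE))

# The cluster remains usable for tasks that do not rely on S4 code.
res <- parLapply(cl2, 1:4, function(x) x^2)
stopCluster(cl2)

# Compare the elapsed start-up times in seconds.
cat("default:        ", t_default[["elapsed"]], "\n")
cat("methods = FALSE:", t_no_methods[["elapsed"]], "\n")
```

Any saving is paid back only if the workers never need methods, so it is worth confirming that the task and the packages it loads do not depend on S4 before disabling it.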

One of the biggest obstacles to parallel performance is the copying and transmission of data between the master process and the worker processes. This obstacle can be large when you run parallel tasks on a cluster of computers, because factors such as limited network bandwidth and data encryption slow down the transmission of data before any computation can begin. Even on a single computer, unnecessary copying of data in memory takes up precious seconds, and the cost multiplies as the data grows. This can also happen the...