R High Performance Programming
Several R packages allow code to be executed in parallel. The parallel package that comes with R provides the foundation for most parallel computing capabilities in other packages. Let's see how it works with an example.
This example involves finding documents that match a regular expression. Regular expression matching can be computationally expensive, depending on the complexity of the regular expression. The corpus, or set of documents, for this example is a sample of the Reuters-21578 dataset for the topic corporate acquisitions (acq) from the tm package. Because this dataset contains only 50 documents, they are replicated 100,000 times to form a corpus of 5 million documents, so that parallelizing the code leads to meaningful savings in execution time.
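library(tm)
data("acq")  # 50 Reuters-21578 documents on corporate acquisitions
# Extract the text of each document, then replicate it 100,000 times
textdata <- rep(sapply(content(acq), content), 1e5)
The task is to find documents that match the regular expression \d+(,\d+)? mln dlrs, which represents monetary...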
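As a rough illustration of how such a matching task might be split across workers with the parallel package, here is a minimal sketch. It is not taken from the book: the small `docs` vector is a hypothetical stand-in for the corpus (so the example runs without tm), and the worker count of 2 and the chunk-per-worker strategy are assumptions made for the example.

```r
library(parallel)  # ships with base R

# Hypothetical stand-in for the corpus: three short strings, replicated.
docs <- rep(c("Company A paid 12 mln dlrs for the unit.",
              "The deal was worth 1,250 mln dlrs.",
              "No price was disclosed."), 1000)

# Backslashes are doubled inside an R string literal.
pattern <- "\\d+(,\\d+)? mln dlrs"

# Serial baseline: grepl is vectorized over the documents.
serial <- grepl(pattern, docs)

# Parallel version: split the documents into one consecutive chunk per
# worker, match each chunk on its own worker, then combine in order.
cl <- makeCluster(2)  # worker count is an arbitrary choice here
chunks <- clusterSplit(cl, docs)
res <- parLapply(cl, chunks, function(x, p) grepl(p, x), p = pattern)
stopCluster(cl)
par_result <- unlist(res)

# Check that both approaches flag the same documents.
identical(serial, par_result)
```

On a corpus this small, the cost of starting the workers and shipping the data to them will likely outweigh any gain; the chunked approach is only expected to pay off when each worker receives a substantial amount of work, as with the replicated corpus described above.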