Columnar databases provide good query performance for datasets that resemble R data frames, such as most data from business IT systems. These datasets are usually two-dimensional and can contain heterogeneous data types. Scientific data, on the other hand, sometimes contain homogeneous data types but are multidimensional. An example is weather readings taken at different points in time and space. For such applications, a new type of database called the array database provides even better query and scientific computing performance. One example is SciDB, available for download at http://www.scidb.org/. SciDB provides a massively parallel processing (MPP) architecture that can perform queries in parallel on petabytes of array data. It supports in-database linear algebra, graph operations, linear models, correlations, and statistical tests. It also offers an R interface through the SciDB package that is available...
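To make the contrast between array data and data frames concrete, here is a minimal sketch in base R. It does not use SciDB or its R package. It only illustrates the shape of the data such a database is designed for. The variable names, the dimension sizes, and the simulated temperature values are all hypothetical.

```r
# Hypothetical weather readings: one temperature value for each combination of
# 3 latitudes, 4 longitudes, and 5 time steps (simulated, not real data).
set.seed(42)
temps <- array(
  rnorm(3 * 4 * 5, mean = 15, sd = 5),
  dim = c(3, 4, 5),
  dimnames = list(
    lat  = paste0("lat", 1:3),
    lon  = paste0("lon", 1:4),
    time = paste0("t", 1:5)
  )
)

# Slicing along a dimension: the full spatial grid at the first time step.
grid_t1 <- temps[, , "t1"]

# Aggregating over dimensions: mean temperature for each time step.
mean_by_time <- apply(temps, 3, mean)

# The same data flattened into a two-dimensional data frame,
# with one row per (lat, lon, time) cell.
temps_df <- as.data.frame(as.table(temps), responseName = "temp")
```

In the array form every cell has the same type and is addressed by its coordinates, so slicing and per-dimension aggregation are direct. The data frame form needs one row per cell and repeats the coordinate labels in every row, which is the kind of overhead an array-oriented store avoids.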
R High Performance Programming