Book Image

R High Performance Programming

Book Image

R High Performance Programming

Overview of this book

Table of Contents (17 chapters)
R High Performance Programming
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Using array databases for maximum scientific-computing performance


Columnar databases provide good query performance for datasets that resemble R data frames, for example, most data from business IT systems. These datasets are usually two dimensional and can contain heterogeneous data types. On the other hand, scientific data sometimes contain homogeneous data types but are multidimensional. An example of this is weather readings in different points in time and space. For such applications, a new type of database called the array database provides even better query and scientific computing performance. One example of this is SciDB, available for download at http://www.scidb.org/. SciDB provides a massively parallel processing (MPP) architecture that can perform queries in parallel on petabytes of array data. It supports in-database linear algebra, graph operations, linear models, correlations, and statistical tests. It also offers an R interface through the SciDB package that is available...