R Data Analysis Cookbook

By: Viswa Viswanathan, Shanthi Viswanathan

Overview of this book

Data analytics with R has emerged as a very important focus for organizations of all kinds. R enables even those with only an intuitive grasp of the underlying concepts, without a deep mathematical background, to unleash powerful and detailed examinations of their data.

This book empowers you by showing you ways to use R to generate professional analysis reports. It provides examples for various important analysis and machine-learning tasks that you can try out with associated and readily available data. The book also teaches you to quickly adapt the example code for your own needs and save yourself the time needed to construct code from scratch.

Reading data from R files and R libraries


During data analysis, you will create several R objects. You can save these in the native R data format and retrieve them later as needed.

Getting ready

First, create and save R objects interactively as shown in the following code. Make sure you have write access to the R working directory:

> customer <- c("John", "Peter", "Jane")
> orderdate <- as.Date(c('2014-10-1','2014-1-2','2014-7-6'))
> orderamount <- c(280, 100.50, 40.25)
> order <- data.frame(customer,orderdate,orderamount)
> names <- c("John", "Joan")
> save(order, names, file="test.Rdata")
> saveRDS(order,file="order.rds")
> remove(order)

After the objects are saved, the remove() function deletes the order object from the current session.

How to do it...

To read data from R files and libraries, follow these steps:

  1. Load data from R data files into memory:

    > load("test.Rdata")
    > ord <- readRDS("order.rds")
  2. The datasets package is loaded in the R environment by default and contains the iris and cars datasets. To load these datasets into memory, use the following code:

    > data(iris)
    > data(cars, iris)

The first command loads only the iris dataset, and the second loads the cars and iris datasets.
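As a minimal sketch of the step above, the datasets can also be requested through the list argument of data(), which takes a character vector of dataset names. The dimension checks are additions for illustration and are not part of the original recipe:

```r
# Load two datasets from the 'datasets' package by name.
# The 'list' argument takes a character vector, so the names
# could also be computed programmatically.
data(list = c("cars", "iris"))

# Confirm that both data frames are available.
print(dim(cars))   # 50 rows, 2 columns
print(dim(iris))   # 150 rows, 5 columns
print(head(iris, 3))
```

This form is convenient when the dataset names are held in a variable rather than typed at the prompt.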

How it works...

The save() function saves the serialized version of the objects supplied as arguments along with their object names. The subsequent load() function restores the saved objects under the same names they were saved with, into the environment from which load() is called (the global environment when you call it at the prompt). If objects with the same names already exist in that environment, they are replaced without any warning.

The saveRDS() function saves only one object. It saves the serialized version of the object and not the object name. Hence, with the readRDS() function the saved object can be restored into a variable with a different name from when it was saved.
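The difference between the two pairs of functions can be sketched as follows. The object name scores and the temporary files are hypothetical, chosen only so that the example does not touch the files created earlier in the recipe. The sketch also shows the envir argument of load(), which is one way to avoid the silent overwrite:

```r
# Temporary files keep the sketch self-contained.
rdata_file <- tempfile(fileext = ".RData")
rds_file   <- tempfile(fileext = ".rds")

scores <- c(10, 20, 30)
save(scores, file = rdata_file)    # stores the object AND its name
saveRDS(scores, file = rds_file)   # stores the object only

# Change the object, then load(): the saved copy silently replaces it.
scores <- "changed after saving"
load(rdata_file)
print(scores)                      # the numeric vector is back

# Loading into a separate environment avoids the overwrite;
# load() returns the names of the objects it restored.
sandbox <- new.env()
loaded_names <- load(rdata_file, envir = sandbox)
print(loaded_names)

# readRDS() returns the object, so any variable name can receive it.
renamed <- readRDS(rds_file)
print(renamed)
```

Loading into a scratch environment first is a cautious way to inspect an unfamiliar file before letting it replace anything in the session.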

There's more...

The preceding recipe has shown you how to read saved R objects. This section covers a few more options.

To save all objects in a session

The following command can be used to save all objects:

> save.image(file = "all.RData")
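The following sketch checks what save.image() wrote by loading the file into a scratch environment. The variable name workspace_demo and the temporary file are hypothetical. The variable is assigned into the global environment explicitly because save.image() saves the objects of the global environment, not those of the calling function:

```r
# Create an object in the global environment (hypothetical name).
assign("workspace_demo", c(1, 2, 3), envir = .GlobalEnv)

# Save the whole workspace to a temporary file.
image_file <- tempfile(fileext = ".RData")
save.image(file = image_file)

# Inspect the saved image without disturbing the current session.
inspect <- new.env()
saved_names <- load(image_file, envir = inspect)
print("workspace_demo" %in% saved_names)
```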

To selectively save objects in a session

To save objects selectively, use the following commands:

> odd <- c(1,3,5,7)
> even <- c(2,4,6,8)
> save(list=c("odd","even"),file="OddEven.Rdata")

The list argument specifies a character vector containing the names of the objects to be saved. Subsequently, loading data from the OddEven.Rdata file creates both odd and even objects. The saveRDS() function can save only one object at a time.
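A round trip with the list argument can be sketched as shown here, using temporary files instead of OddEven.Rdata. The final part is an addition for illustration: because saveRDS() accepts a single object, one common workaround is to bundle several objects into one list first:

```r
odd  <- c(1, 3, 5, 7)
even <- c(2, 4, 6, 8)

# Save the objects named in a character vector.
to_save <- c("odd", "even")
rdata_file <- tempfile(fileext = ".RData")
save(list = to_save, file = rdata_file)

# Remove them, then restore both with a single load().
rm(odd, even)
restored <- load(rdata_file)
print(restored)

# saveRDS() takes one object, so wrap several objects in a list.
rds_file <- tempfile(fileext = ".rds")
saveRDS(list(odd = odd, even = even), file = rds_file)
both <- readRDS(rds_file)
print(both$even)
```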

Attaching/detaching R data files to an environment

If we want to be notified when objects in an Rdata file share names with objects that already exist in the environment, we can attach the file instead of loading it:

> attach("order.Rdata")

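This assumes a file named order.Rdata that contains an object named order. Unlike load(), attach() places the file's objects on the search path rather than copying them into the global environment, so nothing is overwritten. If an object named order already exists in the global environment, it takes precedence over the attached one, and we get the following message (a notice, not an error):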

The following object is masked _by_ .GlobalEnv:

    order
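The attach and detach cycle can be sketched as follows. The temporary file and the search path label order_file are hypothetical names used for illustration. The masking message appears when the code runs at the top level, where order lives in the global environment:

```r
# Save a data frame named 'order' to a temporary file.
order <- data.frame(customer = c("John", "Peter"),
                    orderamount = c(280, 100.5))
order_path <- tempfile(fileext = ".RData")
save(order, file = order_path)

# Keep a different 'order' in the session, then attach the file.
order <- "a different object in the current session"
attach(order_path, name = "order_file")

# The file now sits on the search path under the chosen label.
print("order_file" %in% search())

# The session's own 'order' still wins; fetch the saved copy explicitly.
saved_order <- get("order", pos = "order_file")
print(saved_order)

# Detach the file once it is no longer needed.
detach("order_file")
print("order_file" %in% search())
```

Detaching removes the file's objects from the search path without affecting the objects in the session.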

Listing all datasets in loaded packages

All the datasets in the loaded packages can be listed using the following command:

> data()
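When called interactively, data() opens a listing. To work with the listing in code, a minimal sketch is to capture the returned object and read its results component, which is a character matrix with an Item column. The restriction to the datasets package is an assumption made here to keep the output short:

```r
# Capture the listing for one package instead of displaying it.
listing <- data(package = "datasets")

# 'results' is a character matrix; the Item column holds dataset names.
items <- listing$results[, "Item"]
print(head(items))

# Check whether a particular dataset is available.
print("iris" %in% items)
```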