Book Image

R Data Analysis Cookbook - Second Edition

By : Kuntal Ganguly, Shanthi Viswanathan, Viswa Viswanathan
Book Image

R Data Analysis Cookbook - Second Edition

By: Kuntal Ganguly, Shanthi Viswanathan, Viswa Viswanathan

Overview of this book

Data analytics with R has emerged as a very important focus for organizations of all kinds. R enables even those with only an intuitive grasp of the underlying concepts, without a deep mathematical background, to unleash powerful and detailed examinations of their data. This book will show you how you can put your data analysis skills in R to practical use, with recipes catering to the basic as well as advanced data analysis tasks. Right from acquiring your data and preparing it for analysis to the more complex data analysis techniques, the book will show you how you can implement each technique in the best possible manner. You will also visualize your data using the popular R packages like ggplot2 and gain hidden insights from it. Starting with implementing the basic data analysis concepts like handling your data to creating basic plots, you will master the more advanced data analysis techniques like performing cluster analysis, and generating effective analysis reports and visualizations. Throughout the book, you will get to know the common problems and obstacles you might encounter while implementing each of the data analysis techniques in R, with ways to overcoming them in the easiest possible way. By the end of this book, you will have all the knowledge you need to become an expert in data analysis with R, and put your skills to test in real-world scenarios.
Table of Contents (14 chapters)

Reading data from R files and R libraries

During data analysis, you will create several R objects. You can save these in the native R data format and retrieve them later as needed.

Getting ready

First, create and save the R objects interactively, as shown in the following code. Make sure you have write access to the R working directory.

> customer <- c("John", "Peter", "Jane") 
> orderdate <- as.Date(c('2014-10-1','2014-1-2','2014-7-6'))
> orderamount <- c(280, 100.50, 40.25)
> order <- data.frame(customer,orderdate,orderamount)
> names <- c("John", "Joan")
> save(order, names, file="test.Rdata")
> saveRDS(order,file="order.rds")
> remove(order)

After saving the preceding code, the remove() function deletes the object from the current session.

How to do it...

To be able to read data from R files and libraries, follow these steps:

  1. Load data from the R data files into memory:
> load("test.Rdata") 
> ord <- readRDS("order.rds")
  1. The datasets package is loaded in the R environment by default and contains the iris and cars datasets. To load these datasets data into memory, use the following code:
> data(iris) 
> data(list(cars,iris))

The first command loads only the iris dataset, and the second loads both the cars and iris datasets.

How it works...

The save() function saves the serialized version of the objects supplied as arguments along with the object name. The subsequent load() function restores the saved objects, with the same object names that they were saved with, to the global environment by default. If there are existing objects with the same names in that environment, they will be replaced without any warnings.

The saveRDS() function saves only one object. It saves the serialized version of the object and not the object name. Hence, with the readRDS() function, the saved object can be restored into a variable with a different name from when it was saved.

There's more...

The preceding recipe has shown you how to read saved R objects. We see more options in this section.

Saving all objects in a session

The following command can be used to save all objects:

> save.image(file = "all.RData") 

Saving objects selectively in a session

To save objects selectively, use the following commands:

> odd <- c(1,3,5,7) 
> even <- c(2,4,6,8)
> save(list=c("odd","even"),file="OddEven.Rdata")

The list argument specifies a character vector containing the names of the objects to be saved. Subsequently, loading data from the OddEven.Rdata file creates both odd and even objects. The saveRDS() function can save only one object at a time.

Attaching/detaching R data files to an environment

While loading Rdata files, if we want to be notified whether objects with the same names already exist in the environment, we can use:

> attach("order.Rdata") 

The order.Rdata file contains an object named order. If an object named order already exists in the environment, we will get the following error:

The following object is masked _by_ .GlobalEnv: 

order

Listing all datasets in loaded packages

All the loaded packages can be listed using the following command:

> data()