RStudio for R Statistical Computing Cookbook

By: Andrea Cirillo

Overview of this book

Handling complex datasets, performing advanced statistical analysis, and providing real-time visualizations to businesses are concerns shared by statisticians and analysts across the globe. RStudio is a powerful tool for statistical analysis that harnesses R for computational statistics, visualization, and data science in an integrated development environment. This book is a collection of recipes that will help you learn and understand RStudio features so that you can effectively perform statistical analysis and reporting, code editing, and R development. The first few chapters will teach you how to set up your own data analysis project in RStudio, acquire data from different data sources, and manipulate and clean data for analysis and visualization purposes. You'll get hands-on with various data visualization methods using ggplot2, and you will create interactive and multidimensional visualizations with D3.js. Additional recipes will help you optimize your code; implement various statistical models to manage large datasets; perform text analysis and predictive analysis; and master time series analysis, machine learning, and forecasting. In the final few chapters, you'll learn how to create reports from your analytical application with the full range of static and dynamic reporting tools available in RStudio so that you can effectively communicate results and even transform them into interactive web applications.

Credits

About the Author

About the Reviewer

www.PacktPub.com

Preface

Acquiring Data for Your Project

Introduction

Acquiring data from the Web – web scraping tasks

Accessing an API with R

Getting data from Twitter with the twitteR package

Getting data from Facebook with the Rfacebook package

Getting data from Google Analytics

Loading your data into R with the rio package

Converting file formats using the rio package

Preparing for Analysis – Data Cleansing and Manipulation

Introduction

Getting a sense of your data structure with R

Preparing your data for analysis with the tidyr package

Detecting and removing missing values

Substituting missing values using the mice package

Detecting and removing outliers

Performing data filtering activities

Basic Visualization Techniques

Introduction

Looking at your data using the plot() function

Using pairs.panels() to look at (visualize) correlations between variables

Adding text to a ggplot2 plot at a custom location

Changing axes appearance in a ggplot2 plot (continuous axes)

Producing a matrix of graphs with ggplot2

Drawing a route on a map with ggmap

Making use of the igraph package to draw a network

Showing communities in a network with the linkcomm package

Advanced and Interactive Visualization

Introduction

Producing a Sankey diagram with the networkD3 package

Creating a dynamic force network with the visNetwork package

Building a rotating 3D graph and exporting it as a GIF

Using the DiagrammeR package to produce a process flow diagram in RStudio

Power Programming with R

Introduction

Writing modular code in RStudio

Implementing parallel computation in R

Creating custom objects and methods in R using the S3 system

Evaluating your code performance using the profvis package

Comparing alternative functions' performance using the microbenchmark package

Using GitHub with RStudio

Domain-specific Applications

Introduction

Dealing with regular expressions

Analyzing PDF reports in a folder with the tm package

Creating word clouds with the wordcloud package

Performing a Twitter sentiment analysis

Detecting fraud in e-commerce orders with Benford's law

Measuring customer retention using cohort analysis in R

Making a recommendation engine

Performing time series decomposition using the stl() function

Exploring time series forecasting with forecast()

Tracking stock movements using the quantmod package

Optimizing portfolio composition and maximising returns with the PortfolioAnalytics package

Forecasting the stock market

Developing Static Reports

Introduction

Using one markup language for all types of documents – rmarkdown

Writing and styling PDF documents with RStudio

Writing wonderful Tufte handouts with the tufte package and rmarkdown

Sharing your code and plots with slides

Curating a blog through RStudio

Dynamic Reporting and Web Application Development

Introduction

Generating dynamic parametrized reports with R Markdown

Developing a single-file Shiny app

Changing a Shiny app UI based on user input

Creating an interactive report with Shiny

Constructing RStudio add-ins

Sharing your work on RPubs

Deploying your app on Amazon AWS with ramazon

Index

Preparing your data for analysis with the tidyr package

The tidyr package is another gift from Hadley Wickham. This package provides functions to make your data tidy.

This means that after applying the tidyr package's functions, your data will be arranged according to the following rules:

Each column will contain an attribute
Each row will contain an observation
Each cell will contain a value

Applying these rules produces what is usually called a tidy dataset.
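A dataset that follows the three rules above might look like the minimal sketch below. The data frame, its column names (city, year, value), and its numbers are hypothetical and chosen only for illustration; they are not the book's own example.

```r
# Hypothetical tidy dataset (illustrative values only):
# each column holds one attribute, each row one observation,
# and each cell a single value.
tidy_example <- data.frame(
  city  = c("Rome", "Rome", "Milan", "Milan"),
  year  = c(2013, 2014, 2013, 2014),
  value = c(10, 11, 20, 21),
  stringsAsFactors = FALSE
)

# Inspect the structure: three attributes, four observations
str(tidy_example)

# Because every attribute is a plain vector, grouped summaries are one call away
by_city <- tapply(tidy_example$value, tidy_example$city, mean)
print(by_city)
```

The tapply() call hints at why this layout is convenient: each column can be passed directly to functions that operate on whole vectors.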

This structure, besides giving you a clearer understanding of your data, will let you work with it more easily.

Furthermore, this structure will let you take full advantage of R's inherently vectorized design. This recipe will show you how to apply the gather function to a dataset so that it complies with the rules listed above.

The data frame used in this recipe is in the so-called wide format, where each period of observation is stored in its own column, with each column representing a year.
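To make the wide format and its conversion concrete, here is a minimal sketch that assumes the tidyr package is installed. The data frame wide, its city column, the year columns, and all values are hypothetical stand-ins, not the dataset used in the book.

```r
library(tidyr)

# Hypothetical wide-format data: one row per city, one column per year.
# check.names = FALSE keeps the year column names as "2013", "2014", "2015".
wide <- data.frame(
  city   = c("Rome", "Milan"),
  `2013` = c(10, 20),
  `2014` = c(11, 21),
  `2015` = c(12, 22),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# gather() stacks every column except city into key/value pairs:
# the former column names go into "year", the cell contents into "value".
long <- gather(wide, key = "year", value = "value", -city)

print(long)
```

After the call, each row of long describes one city in one year, which matches the three rules listed earlier. In recent tidyr releases gather() is marked as superseded by pivot_longer(); it still works, and it is used here because it is the function this recipe names.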

Getting ready

In order to let you apply...
