Applied Data Visualization with R and ggplot2

By: Dr. Tania Moulik

Overview of this book

Applied Data Visualization with R and ggplot2 introduces you to the world of data visualization by taking you through the basic features of ggplot2. To start with, you’ll learn how to set up the R environment, and then gain insight into the grammar of graphics and geometric objects before you explore the plotting techniques. You’ll discover what layers, scales, coordinates, and themes are, and study how you can use them to transform your data into aesthetically pleasing graphs. Once you’ve grasped the basics, you’ll move on to studying simple plots such as histograms and advanced plots such as superimposed and density plots. You’ll also get to grips with plotting trends, correlations, and statistical summaries. By the end of this book, you’ll have created data visualizations that will impress your clients.

Geoms and Statistical Summaries


Sometimes, you will need to calculate statistical summaries, such as the mean, median, or a quartile of a variable, and view changes with respect to another variable. This can be done by using grouping commands.
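As a minimal sketch of such a grouped summary, the following uses dplyr on a small made-up data frame to compute the mean, median, and upper quartile of a score for each genre. The `movies` data frame, its values, and the output column names are hypothetical and only stand in for a real dataset.

```r
library(dplyr)

# Hypothetical toy data standing in for a real dataset
movies <- data.frame(
  Genre = c("Action", "Action", "Action", "Comedy", "Comedy", "Drama"),
  AudienceScore = c(60, 70, 80, 50, 90, 75)
)

# Group by Genre, then collapse each group to one row of summaries
score_summary <- movies %>%
  group_by(Genre) %>%
  summarise(
    mean_score   = mean(AudienceScore),
    median_score = median(AudienceScore),
    q3_score     = quantile(AudienceScore, 0.75, names = FALSE)
  )

print(score_summary)
```

Each genre ends up as a single row, so the summary can be plotted against Genre in the same way as the raw data.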

Let's plot AudienceScore against Genre for the HollywoodMovies dataset. To keep the x-axis labels from overlapping, rotate the axis label text using the following command:

ggplot(HollywoodMovies, aes(Genre, AudienceScore)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 40))

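You'll get the following output: a scatter plot with one vertical strip of points per genre and the x-axis labels rotated by 40 degrees.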
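To try the label rotation without the HollywoodMovies dataset, here is a small self-contained sketch. The `movies` data frame is made up for illustration, and the `hjust = 1` argument is an optional addition not used in the command above; it right-aligns each rotated label so that its end sits at the tick mark.

```r
library(ggplot2)

# Hypothetical toy data: Genre is categorical, AudienceScore is numeric
movies <- data.frame(
  Genre = c("Action", "Action", "Comedy", "Comedy", "Documentary", "Documentary"),
  AudienceScore = c(62, 71, 55, 80, 77, 85)
)

p <- ggplot(movies, aes(Genre, AudienceScore)) +
  geom_point() +
  # Rotate the x-axis labels by 40 degrees and right-align them
  theme(axis.text.x = element_text(angle = 40, hjust = 1))

p
```

Because Genre is categorical, every point for a genre shares the same x position, which is why a per-genre summary is often easier to read than the raw points.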

Using Grouping to Create a Summarized Plot

In this section, we'll use grouping to summarize multiple y values for a given x value. Let's begin by implementing the following steps:

  1. Group the data by genre and remove rows with missing (NA) values:
library(dplyr)  # provides group_by() and summarise()
gp_scr <- group_by(HollywoodMovies, Genre)
gp_scr <- na.omit(gp_scr)
  2. Calculate the mean and standard deviation using the summarise function and store the result in a new dataset, as follows:
dfnew <- dplyr::summarise(gp_scr...
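The same grouping idea can be tried on made-up data. The following is an independent sketch, not the book's exact command: the `movies` data frame and the column names `mean_score` and `sd_score` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data with one missing score
movies <- data.frame(
  Genre = c("Action", "Action", "Action", "Comedy", "Comedy", "Comedy"),
  AudienceScore = c(60, 70, 80, 50, NA, 90)
)

# Drop rows containing NA, then group the remaining rows by genre
grouped <- group_by(na.omit(movies), Genre)

# One row per genre holding the mean and standard deviation
genre_stats <- summarise(
  grouped,
  mean_score = mean(AudienceScore),
  sd_score   = sd(AudienceScore)
)

print(genre_stats)
```

Note that na.omit drops a row if any of its columns is NA, not only the column being summarized. When only one column matters, passing `na.rm = TRUE` to `mean()` and `sd()` is an alternative that keeps the other rows intact.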