R Data Visualization Recipes

By: Vitor Bianchi Lanzetta

Overview of this book

R is an open source language for data analysis and graphics that allows users to load various packages for effective and better data interpretation. Its popularity has soared in recent years because of its powerful capabilities when it comes to turning different kinds of data into intuitive visualization solutions. This book is an update to our earlier R data visualization cookbook with 100 percent fresh content and covering all the cutting-edge R data visualization tools. This book is packed with practical recipes, designed to provide you with all the guidance needed to get to grips with data visualization using R. It starts off with the basics of the ggplot2, ggvis, and plotly visualization packages, along with an introduction to creating maps and customizing them, before progressively taking you through various ggplot2 extensions, such as ggforce, ggrepel, and gganimate. Using real-world datasets, you will analyze and visualize your data as histograms, bar graphs, and scatterplots, and customize your plots with various themes and coloring options. The book also covers advanced visualization aspects such as creating interactive dashboards using Shiny. By the end of the book, you will be equipped with key techniques to create impressive data visualizations with professional efficiency and precision.

Installing and loading graphics packages


Before starting, there are some habits you may want to cultivate in order to keep improving your R skills. First of all, whenever you program you will face some challenges. Usually those are tackled either by thinking the problem through or by doing some research. Keep a record of problems and their solutions, whether for the times you face them again or simply for study.

Note

Speaking for myself, making a library-like folder and gathering some commented examples of problems and their resolutions was, and still is, of great help. Naming files properly and making good use of comments (in R, # starts a comment) makes reviewing much easier.

R Markdown documents are pretty useful if you want to keep track of your own development and optionally publish it for others to see. Publishing the learning process is a good way to self-promote. Also, keep in mind that R is a programming language, and programming languages can often solve a problem in more than one way, so be open-minded and seek different solutions.

First things first: in order to make good use of a package, you need to install it and know how to call its functions.

Note

If your R session has been running for a long time, there is a good chance that a bunch of packages are already loaded. Before installing or updating a package, it's good practice to restart R so that the installation won't interfere with related loaded packages.

How to do it...

Run the following code to install the graphics packages properly:

> install.packages(c('devtools','plotly','ggvis'))
> devtools::install_github('hadley/ggplot2')

How it works...

Most of the book covers three graphics packages: ggplot2, plotly, and ggvis. In order to install a new package, you can type the function install.packages() into the console. That function works for packages available at CRAN-like repositories and for local files. When installing from a local file, you typically set more than just the first argument (for example, repos = NULL). Entering ?install.packages into the RStudio console leads you to the function documentation in the Help tab.
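The two installation routes described above can be sketched as follows. This is only an illustration under stated assumptions: the variable names (wanted, to_install, local_archive) and the archive path are hypothetical, and the calls are guarded so that nothing is downloaded or installed unless it is actually needed.

```r
# Packages this recipe asks for
wanted <- c('devtools', 'plotly', 'ggvis')

# installed.packages() returns a matrix whose row names are package names
to_install <- setdiff(wanted, rownames(installed.packages()))

# Contact the repository only in an interactive session and only if needed
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Hypothetical local source archive; the path is an assumption for illustration
local_archive <- 'path/to/somePackage_0.1.0.tar.gz'
if (file.exists(local_archive)) {
  # repos = NULL tells R to install from the file instead of a repository
  install.packages(local_archive, repos = NULL, type = 'source')
}
```

Skipping packages that are already present avoids reinstalling them every time the script runs.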

Shortly after running the recipe, all the packages covered in this chapter (devtools included) should be properly installed. Check the Packages tab in your RStudio application (speed up the search by typing into the search box); if everything went fine, these four should be listed under User Library. The following image shows how it might look:

Figure 1.1 - RStudio package window (bottom right corner).

If it fails, you may want to check the spelling and the internet connection. This function also prints output that includes warnings, progress reports, and results. Look for a message similar to package '<Package Name>' successfully unpacked and MD5 sums checked to make sure that all went fine. Checking the output is good practice for knowing whether the plan worked. It also gives good clues for troubleshooting.
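Besides reading the console output, the result of an installation can also be checked from code. The following is a minimal sketch, assuming a helper named is_installed() (a hypothetical name, not part of any package) built on base R's requireNamespace(). It uses stats, which ships with R, and a deliberately made-up package name, so it runs without installing anything.

```r
# Hypothetical helper: TRUE for each package whose namespace can be loaded
is_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

# 'stats' ships with R; the second name is made up and should not exist
status <- is_installed(c('stats', 'aPackageThatDoesNotExist'))
status
```

Replacing the two names with c('devtools', 'plotly', 'ggvis', 'ggplot2') would report on the packages from this recipe.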

You may want to try the function with a non-existent package (be creative here) and with a package that is already installed, and see what happens. Sometimes incompatibilities prevent proper download and installation. For example, a missing Java installation, or one with the wrong architecture, may prevent you from installing the rJava package.

Note that a package's name must be given as a string in order to work (remember to use ' '). It's also important to check the spelling. The function call and its arguments are case sensitive; if you get even one letter or its case wrong, the desired package will not be found. Also note that the package names were wrapped in a c() call, which builds a vector (try ?c in the console).
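To make the point about strings and vectors concrete, here is a small sketch. The variable name pkgs is just an illustrative choice.

```r
# c() combines the quoted names into a single character vector
pkgs <- c('devtools', 'plotly', 'ggvis')

is.character(pkgs)  # the elements are strings
length(pkgs)        # one element per package name

# Names are case sensitive: 'Plotly' is not the same string as 'plotly'
'Plotly' %in% pkgs
```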

Note

The ? sign is actually a function that comes with a base package called utils. Typing ?<function name> leads you to the documentation whenever there is one to display. All functions from CRAN packages and base R, and probably the majority of those from GitHub, have related documentation files; however, if the function is not from base R, do not forget to load the respective package first. Alternatively, you can make calls like this: ?<package name>::<function name>.
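The help() function is the longer form of ? and can be handy in scripts. The sketch below assumes a standard R installation with help files available; the variable names doc and found are illustrative only.

```r
# Look up a help topic by name, restricted to the utils package.
# Assigning the result keeps the help page from opening right away.
doc <- help('install.packages', package = 'utils')

# A non-empty result means a documentation file was found
found <- length(doc) > 0
found

# In an interactive session, the shorter forms open the same page:
# ?install.packages
# ?utils::install.packages
```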

The first argument given to the install.packages() function was a vector of strings. This means that multiple packages can be downloaded and installed in a single call. The function may also install more than the packages asked for: it installs the packages each of them relies on as well.
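To see which packages another package relies on, one option is tools::package_dependencies(). The sketch below is an assumption-laden illustration: it inspects stats (which ships with R) and uses the locally installed packages as the database, so that no internet connection is needed. Querying a package from this recipe would work the same way once it is installed.

```r
# Use the locally installed packages as the lookup database
local_db <- installed.packages()

# List the packages that 'stats' depends on, imports, or links to
deps <- tools::package_dependencies('stats', db = local_db)
deps
```

Leaving out the db argument makes the function consult the configured repositories instead, which requires a connection.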

Note

Once the packages are installed, you have a bunch of new functions at your disposal. In order to get to know these functions, you can look up the packages' documentation online. Usually, the documentation can be found at the repositories (CRAN, GitHub, and so on).

Now, with a bunch of new functions at hand, the next step is to call a function from a specific package. There are several ways of doing that. One way is to type <package name>::<package function>. The last code block did that when it called install_github(), a function from the devtools package: devtools::install_github().

There are pros and cons to calling a function this way. As for pros, you mostly avoid any name conflict that could possibly happen between packages. You also avoid attaching the whole package when you only need to call a single function. Thus, calling a function this way may be useful on two occasions:

  • A name conflict is expected
  • Only a few functions from the package are needed, and only a few times
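The name-conflict case can be illustrated with a short sketch. Here a user-defined function deliberately reuses the name sd, which also exists in the stats package; the function body and variable names are made up for the example.

```r
values <- c(4, 8, 15, 16, 23, 42)

# A user-defined function that happens to reuse a name from stats
sd <- function(x) 'my own sd()'

own <- sd(values)                # finds the definition above first
from_stats <- stats::sd(values)  # always reaches the stats version

own
from_stats

# Remove the user-defined version so that sd() means stats::sd() again
rm(sd)
```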

Otherwise, if a package is required many times, typing <package name>:: before every function is counterproductive. It's possible to load and attach the whole package at once. In the RStudio interface, right below the pane that shows environment objects, there is a pane with a Packages tab. Under that tab, check a package's box to load it and uncheck the box to detach it.

Try to detach ggplot2 by unchecking the box; keep an eye on that box. You can also load packages using functions: require() and library() both do this job. Unlike install.packages(), neither needs the package name in quotes, although passing the name as a string still works. Note that both functions can only load one package at a time.
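Attaching and detaching can also be done from code. The sketch below uses tools, a package that ships with R but is not attached by default, so it runs without installing anything; the variable names are illustrative. The character.only = TRUE argument is needed when the package name is stored in a variable.

```r
pkg <- 'tools'

# Attach a package whose name is held in a variable
library(pkg, character.only = TRUE)
attached_after_library <- 'package:tools' %in% search()

# Detach it again, the code equivalent of unchecking the box
detach('package:tools', character.only = TRUE)
attached_after_detach <- 'package:tools' %in% search()

# Since each call handles one package, a loop covers several
for (p in c('stats', 'utils')) {
  library(p, character.only = TRUE)
}
```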

Although require() and library() work in a very similar way, they are not identical. If require() fails it issues a warning; library(), on the other hand, throws an error. There is more: require() returns a logical value, TRUE when the load succeeds and FALSE when it fails, whereas library() by default returns (invisibly) the names of the attached packages rather than a success flag.
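The difference in failure behaviour can be seen with a package name that does not exist. This is a minimal sketch; the package name and variable names are made up for the demonstration.

```r
fake_pkg <- 'aPackageThatDoesNotExist'

# require() signals failure through its return value
ok <- suppressWarnings(
  require(fake_pkg, character.only = TRUE, quietly = TRUE)
)
ok

# library() stops with an error, which tryCatch() intercepts here
lib_result <- tryCatch(
  library(fake_pkg, character.only = TRUE),
  error = function(e) 'error caught'
)
lib_result
```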

For everyday loading, this difference rarely matters, but if you want to write a function or loop that depends on loading a package and checking whether it succeeded, you may find it easier to use require(). Using the logical operator & (and), it's possible to load all three packages at once and store the result in a single variable. Calling this variable returns TRUE if all loads succeeded and FALSE if any one failed. This is done as follows:

> lcheck <- require(ggplot2) & require(plotly) & require(ggvis)
> lcheck

Note

lcheck won't tell you which packages failed or how many. Try assigning c(require(ggplot2), require(plotly), require(ggvis)) instead. Each FALSE element marks a package that is giving you trouble, which makes troubleshooting easier.
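A variation on this idea keeps the package names attached to the results, so the failing ones can be read off directly. The sketch below is an illustration only: it uses two packages that ship with R plus a made-up name, and the variable names pkgs, loaded, and failed are hypothetical.

```r
pkgs <- c('stats', 'utils', 'aPackageThatDoesNotExist')

# One TRUE/FALSE per package, named after the package
loaded <- vapply(
  pkgs,
  function(p) {
    suppressWarnings(require(p, character.only = TRUE, quietly = TRUE))
  },
  logical(1)
)
loaded

# Names of the packages that failed to load
failed <- names(loaded)[!loaded]
failed
```

Swapping in c('ggplot2', 'plotly', 'ggvis') would check the packages from this recipe.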

By now you should be able to install R packages (from CRAN, Git repositories, or local files), load them, and call a function from a specific package. Now that you are familiar with R package installation and loading procedures, the next section gives an introduction to the ggplot2 package framework.

There's more

Installation is also possible via RStudio features, which may seem more user friendly for newcomers. Open RStudio, go to Tools > Install Packages..., type the packages' names (separate them with a space or comma), and click Install. RStudio fills in the install.packages() function and shows it in your console.

This is most useful when you are not absolutely sure about the package name, but have a good clue. An automatic suggestion feature helps you figure out exactly what the package name is. You can also install packages from local files by using this feature. Look for an option called Install from and switch it to Package Archive File instead of Repository.

RStudio also gives you a Check for Package Updates... option right below Install Packages... Use it once in a while to make sure your packages are up to date. Along with the packages to be updated, it also shows what is new about them.

See also...