Modern R Programming Cookbook

By: Jaynal Abedin

Overview of this book

R is a powerful tool for statistics, graphics, and statistical programming. It is used by tens of thousands of people daily to perform serious statistical analyses. It is a free, open source system whose implementation is the collective accomplishment of many intelligent, hard-working people. There are more than 2,000 available add-ons, and R is a serious rival to all commercial statistical packages. The objective of this book is to show how to work with the different programming aspects of R. Emerging R developers and data scientists may have very good programming knowledge but only a limited understanding of R syntax and semantics. This book is a platform for developing practical solutions to real-world problems in a scalable fashion and with a sound understanding of the language. You will work with various versions of R libraries that are essential for scalable data science solutions. You will learn to handle input/output issues when working with relatively large datasets. By the end of this book, you will also know how to work with databases from within R, and what metaprogramming is and how it helps in developing applications.
Table of Contents (10 chapters)

Installing R libraries from various sources

An R library, or package, is a collection of previously programmed functions for specific tasks. When you install base R, it comes with a number of default libraries, but users often need additional, customized libraries to solve their problems. In this recipe, you will see how you can install libraries from different sources, such as CRAN, GitHub, and Bioconductor (BioC).

Getting ready

Suppose you are interested in visualizing your data using the ggplot2 library, but when you call the library using the library(ggplot2) code, you end up getting an error saying that ggplot2 is not found. Now, you need to install ggplot2. In this recipe, you will install the following libraries from the sources mentioned:

  • The ggplot2 library from CRAN
  • The devtools library from CRAN
  • The dplyr library from GitHub
  • The GenomicFeatures library from BioC

How to do it…

The default utils library provides a function called install.packages() for installing a package from within the R console. When called, this command prompts you to select an appropriate CRAN mirror server.

To install packages using this approach, the computer must have an active internet connection.

The ggplot2 library

Let's take a look at the following steps to install the ggplot2 library:

  1. Open the R console or terminal and then type the following command:
      install.packages("ggplot2")

     The preceding command will then ask you to select a server, as follows:

      install.packages("ggplot2")
      --- Please select a CRAN mirror for use in this session ---
  2. It will now install ggplot2 and its dependent libraries. If you want to avoid selecting a mirror server, you can specify the mirror within the install.packages() function using the repos= argument.
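
The repos= argument mentioned in the last step can be combined with a check for packages that are already present, so that nothing is downloaded unnecessarily. The following is a minimal sketch, not part of the original recipe: the helper name install_if_missing() is hypothetical, and the mirror URL is one commonly used CRAN address that you may want to replace with a mirror closer to you.

```r
# Install only the packages in `pkgs` that are not yet available,
# passing `repos` so that R does not prompt for a CRAN mirror.
# NOTE: install_if_missing() and the default mirror URL are illustrative
# assumptions, not part of base R.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  # requireNamespace() returns TRUE when a package can be loaded
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!installed]
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  # Return (invisibly) the names of the packages that had to be installed
  invisible(to_install)
}
```

Calling install_if_missing("ggplot2") would then install ggplot2 only when library(ggplot2) would otherwise fail, and it does so without the mirror prompt.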

The devtools library

This is another library that extends the functionality of the utils library of base R. It is convenient for developing various tools within R, and with it you can install a required library from GitHub. To install devtools along with its dependent libraries, use the install.packages() function as you did for ggplot2.

Installing a library from GitHub

To install any library from GitHub, you can use the install_github() function from the devtools library. You just need to know the name of the library and the GitHub ID of the repository owner. See the following installation code for the dplyr library from GitHub:

    library(devtools)
    install_github("hadley/dplyr")
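
Because install_github() takes the owner ID and the library name as a single "owner/repo" string, it can help to build that string explicitly. The sketch below is an illustration under assumptions: github_spec() and install_from_github() are hypothetical helper names, and the optional "@ref" suffix (a branch, tag, or commit) relies on the repository specification format accepted by install_github().

```r
# Build the repository specification expected by install_github():
# "owner/repo", optionally followed by "@ref" for a branch, tag, or commit.
# NOTE: github_spec() is a hypothetical helper, not a devtools function.
github_spec <- function(owner, repo, ref = NULL) {
  spec <- paste0(owner, "/", repo)
  if (!is.null(ref)) {
    spec <- paste0(spec, "@", ref)
  }
  spec
}

# Install a library from GitHub, failing early if devtools is absent.
# NOTE: install_from_github() is also a hypothetical wrapper.
install_from_github <- function(owner, repo, ref = NULL) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("devtools is required; install it first with install.packages()")
  }
  devtools::install_github(github_spec(owner, repo, ref))
}
```

With these helpers, install_from_github("hadley", "dplyr") performs the same installation as the command shown above.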

Installing a library from the BioC repository

To install any library from the BioC repository, you have to use the biocLite.R file and then use the biocLite() function to install the library. The following code snippet is to install the GenomicFeatures library from the BioC repository:

    source("https://bioconductor.org/biocLite.R")
    biocLite("GenomicFeatures")
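
Note that the biocLite() approach applies to older Bioconductor releases. Starting with Bioconductor 3.8, installation moved to the BiocManager package, which is distributed on CRAN. The following is a minimal sketch of the equivalent installation on a recent setup, assuming R 3.5.0 or later and an active internet connection; install_bioc_package() is a hypothetical helper name, and the CRAN mirror URL is an illustrative choice.

```r
# Install a Bioconductor library through BiocManager, installing
# BiocManager itself from CRAN first if it is not yet available.
# NOTE: install_bioc_package() is a hypothetical wrapper for illustration.
install_bioc_package <- function(pkg) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager", repos = "https://cloud.r-project.org")
  }
  BiocManager::install(pkg)
}
```

For example, install_bioc_package("GenomicFeatures") would install the same library as the biocLite() call above.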

How it works…

Each of the commands used to install a library, whether install.packages(), install_github(), or biocLite(), first connects to the server where the source code or binary release of the specified library is located. It then checks whether the dependent libraries are already installed on the computer. If a required dependency is absent, the command downloads and installs it before installing the library you specified. The command also determines the location where the installed library will be stored. You can specify this location explicitly or use the default. The recommended approach is to specify a location and install all customized libraries into that folder.

To specify the installation location, you can use the lib= option within the function. Make sure you have created the folder that you are going to use as the destination folder. Here is an example:

    install.packages("ggplot2", lib="c:/rPackages")
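
Creating the destination folder and telling R to search it can be done from within R as well. The sketch below is an illustration under assumptions: use_library_dir() is a hypothetical helper name, and the temporary directory stands in for a permanent folder such as "c:/rPackages".

```r
# Create a folder for customized libraries (if needed) and put it at the
# front of R's library search path for the current session.
# NOTE: use_library_dir() is a hypothetical helper, not part of base R.
use_library_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  # .libPaths() with an argument prepends `path` to the search path
  .libPaths(c(path, .libPaths()))
  invisible(normalizePath(path, winslash = "/"))
}

# A temporary folder is used here only so the example is safe to run;
# replace it with your own permanent location.
lib_dir <- use_library_dir(file.path(tempdir(), "rPackages"))
```

After this, install.packages("ggplot2", lib = lib_dir) installs into that folder, and library(ggplot2) can find the library there for the rest of the session.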

There's more…

Using devtools, you can install libraries from CRAN, GitHub, or even the BioC repository. The devtools library provides specific functions for this, as follows:

  • install_github()
  • install_bioc()
  • install_bitbucket()
  • install_cran()
  • install_git()
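
One way to see how the functions listed above relate to each other is a small dispatcher that picks the installer based on the source. This is only a sketch under assumptions: install_from() is a hypothetical helper name, and the meaning of the spec argument depends on the source (a library name for CRAN and BioC, "owner/repo" for GitHub and Bitbucket, and a repository URL for Git).

```r
# Choose the devtools installer that matches the source of a library.
# NOTE: install_from() is a hypothetical helper for illustration only.
install_from <- function(from, spec) {
  # Validate the source name first; unknown sources raise an error
  from <- match.arg(from, c("cran", "github", "bioc", "bitbucket", "git"))
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("devtools is required; install it first with install.packages()")
  }
  installer <- switch(from,
    cran      = devtools::install_cran,
    github    = devtools::install_github,
    bioc      = devtools::install_bioc,
    bitbucket = devtools::install_bitbucket,
    git       = devtools::install_git
  )
  installer(spec)
}
```

For example, install_from("cran", "ggplot2") and install_from("github", "hadley/dplyr") would correspond to the two installations performed earlier in this recipe.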

See also

You might need a specific version of a library to do a certain task, and that version could be an older one. To install a specific version of a library, see the next recipe.