Practical Data Science Cookbook - Second Edition

By: Prabhanjan Narayanachar Tattar, Bhushan Purushottam Joshi, Sean Patrick Murphy, Abhijit Dasgupta, Anthony Ojeda

Overview of this book

As more data is generated each year, the need to analyze it and create value from it is more important than ever. Companies that know what to do with their data and how to do it well will have a competitive advantage over companies that don't. Because of this, there will be increasing demand for people who possess both the analytical and technical abilities to extract valuable insights from data and build solutions that put those insights to use. Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline, and guides you through several data projects in a step-by-step format. By sequentially working through the steps in each chapter, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations with examples using the two most popular programming languages for data analysis: R and Python.

Installing libraries in R and RStudio


R has an incredible number of libraries that add to its capabilities. In fact, R has become the default language for many college and university statistics departments. As a result, R is often the first language in which newly developed statistical algorithms and techniques are implemented. Luckily, installing additional libraries is easy, as you will see in the following sections.

Getting ready

As long as you have R or RStudio installed, you should be ready to go.

How to do it...

R makes installing additional packages simple:

  1. Launch the R interactive environment or, preferably, RStudio.
  2. Let's install ggplot2. Type the following command, and then press the Enter key:
install.packages("ggplot2")

Note

For the remainder of the book, when we ask you to enter a line of text, it is assumed that you then press the Return or Enter key on the keyboard.

  3. You should now see text similar to the following as you scroll down the screen:
trying URL 'http://cran.rstudio.com/bin/macosx/contrib/3.0/ggplot2_0.9.3.1.tgz'
Content type 'application/x-gzip' length 2650041 bytes (2.5 Mb)
opened URL
==================================================
downloaded 2.5 Mb

The downloaded binary packages are in
/var/folders/db/z54jmrxn4y9bjtv8zn_1zlb00000gn/T//Rtmpw0N1dA/downloaded_packages
  4. You might have noticed that you need to know the exact name of the package you wish to install, in this case, ggplot2. Visit http://cran.us.r-project.org/web/packages/available_packages_by_name.html to make sure you have the correct name.
  5. RStudio provides a simpler mechanism to install packages. Open up RStudio if you haven't already done so.
  6. Go to Tools in the menu bar and select Install Packages.... A new window will pop up.
  7. As soon as you start typing in the Packages field, RStudio will show you a list of possible packages. The autocomplete feature of this field simplifies the installation of libraries. Better yet, related libraries, and earlier or newer versions of a library, whose names begin with the same few letters also appear in the list.
  8. Let's install a few more packages that we highly recommend. At the R prompt, type the following commands:
install.packages("lubridate") 
install.packages("plyr") 
install.packages("reshape2")
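
The steps above install each package with a separate command. As a minimal sketch, assuming you would rather install only what is missing, the following checks the four packages named in this recipe against what is already installed. The helper missing_packages is hypothetical (it is not part of R), and the actual install.packages call is left commented out because it needs network access and a CRAN mirror:

```r
# Packages named in this recipe.
wanted <- c("ggplot2", "lubridate", "plyr", "reshape2")

# Hypothetical helper (not part of R): return the packages in `pkgs`
# that are not found in any library listed by .libPaths().
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(wanted)

# install.packages() accepts a character vector, so one call is enough.
# Uncomment the next line to perform the installation.
# if (length(to_install) > 0) install.packages(to_install)

# Show which of the wanted packages are still missing.
print(to_install)
```

After the installation, library(ggplot2) loads the package into the current session, and packageVersion("ggplot2") reports which version was installed.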

How it works...

Whether you use RStudio's graphical interface or the install.packages command, the result is the same: you tell R to search CRAN for the appropriate library built for your particular version of R. R then reports the URL where it found a match and, once the download completes, the location of the downloaded binary packages.
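
To see the settings that influence this search on your own machine, a short sketch such as the following may help. It only reads the current configuration and installs nothing; the variable names are our own, and the values printed will differ from one installation to another:

```r
# The running R version, which determines which package build is requested.
r_version <- paste(R.version$major, R.version$minor, sep = ".")

# Repositories that install.packages() searches by default.
repos <- getOption("repos")

# Library directories where installed packages are stored and looked up.
lib_paths <- .libPaths()

# Package type requested on this platform (a binary type or "source").
pkg_type <- getOption("pkgType")

print(r_version)
print(repos)
print(lib_paths)
print(pkg_type)
```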

There's more...

R's community is one of its strengths, and we would be remiss if we didn't briefly mention two resources. R-bloggers is a website that aggregates R-related news and tutorials from over 750 different blogs. If you have a few questions on R, this is a great place to look for more information. The Stack Overflow site ( http://www.stackoverflow.com ) is a great place to ask questions and find answers on R; questions there are tagged r, while rstats is the hashtag the community uses on social media.

Finally, as your prowess with R grows, you might consider building an R package that others can use. Giving an in-depth tutorial on the library building process is beyond the scope of this book, but keep in mind that community submissions form the heart of the R movement.

See also

You can also refer to the following: