Learning R Programming

By: Kun Ren

Overview of this book

R is a high-level functional language and one of the must-know tools for data science and statistics. Powerful but complex, R can be challenging for beginners and those unfamiliar with its unique behaviors. Learning R Programming is the solution: an easy and practical way to learn R and develop a broad and consistent understanding of the language. Through hands-on examples you'll discover powerful R tools and R best practices that will give you a deeper understanding of working with data. You'll get to grips with R's data structures and data processing techniques, as well as the most popular R packages to boost your productivity from the outset. Start with the basics of R, then dive deep into the programming techniques and paradigms to make your R code excel. Advance quickly to a deeper understanding of R's behavior as you learn common tasks including data analysis, databases, web scraping, high performance computing, and writing documents. By the end of the book, you'll be a confident R programmer adept at solving problems with the right techniques.

RStudio


RStudio is a powerful user interface for R programming. It's free, open source, and works on multiple platforms including Windows, Mac, and Linux.

RStudio has very powerful features that hugely boost your productivity in data analysis and visualization. It supports syntax highlighting, autocompletion, multi-tabbed views, file management, graphics viewport, package management, integrated help viewer, code formatting, version control, interactive debugging, and many more features.

You can download the latest release of RStudio at https://www.rstudio.com/products/rstudio/download. If you want to try the preview version with new features, download it from https://www.rstudio.com/products/rstudio/download/preview. Note that RStudio does not include R, so you need to make sure that R is installed before you start working in RStudio.

In the following sections, I'll give you a brief introduction to the user interface of RStudio.

RStudio's user interface

The following screenshot shows the RStudio user interface in the Windows operating system. If you are using Mac OS X or a supported version of Linux, the screen should look almost the same.

You may notice that the main window consists of several parts. Each part is called a pane and performs different functions. These panes are well designed for data analysts to work with data.

The console

The following screenshot shows the R console embedded in RStudio. In most cases, the console works exactly like a Command Prompt or terminal. In fact, when you type in a command at the console, RStudio will submit the request to the R engine. It is the R engine that executes all the commands. The role of RStudio is to stand in the middle: it passes input from the user to the R engine and presents the results the engine returns.

Using the console, you can easily execute a command, define a variable, or evaluate an expression interactively to compute a statistical measure, transform data, or produce charts.
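As a minimal sketch of this kind of interactive work, the following lines can be typed at the console one at a time. The variable names `heights` and `heights_m` and the values are made up for illustration.

```r
# Define a variable: a numeric vector of made-up heights in centimeters
heights <- c(172, 165, 180, 158, 175)

# Evaluate expressions to compute statistical measures
mean(heights)   # arithmetic mean
sd(heights)     # sample standard deviation

# Transform data: convert centimeters to meters
heights_m <- heights / 100

# Produce a chart (in RStudio it is drawn in the Plots pane)
hist(heights, main = "Heights", xlab = "Height (cm)")
```

Each expression is evaluated as soon as you press Enter, and its value is printed immediately unless it is assigned to a variable.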

The editor

Typing in commands at the console is not the usual way we work with data. Instead, we write scripts, a set of commands representing a logic flow that can be read from a file and executed by the R engine. The editor is useful for editing R scripts, markdown documents, web pages, many types of configuration files, and even C++ source code.

The code editor offers much more than a plain text editor: it supports syntax highlighting, autocompletion of R code, debugging with breakpoints, and so on. More specifically, when editing R scripts you can use the following shortcut keys:

  • Press Ctrl + Enter to execute the selected lines

  • Press Ctrl + Shift + S to source the current document, that is, to evaluate all the expressions sequentially in the current document

  • Press Tab or Ctrl + Space to show an autocompletion list of variables and functions that match your current typing

  • Click in the margin to the left of a line number to set a breakpoint; the next time this line is executed, the program will pause so that you can inspect its state
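Sourcing a document is roughly what base R's `source()` function does when you call it yourself. The sketch below writes a tiny script to a temporary file and then sources it; the file contents and the names `radius` and `area` are hypothetical, and in practice you would create and save the script in the editor instead.

```r
# Write a small two-line script to a temporary file
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "radius <- 2",
  "area <- pi * radius^2"
), script_path)

# source() evaluates every expression in the file, in order
source(script_path)

# Objects defined by the script are now available in the workspace
area
```

Because the expressions are evaluated sequentially, `area` can safely rely on `radius` being defined on the line before it.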

The Environment pane

The Environment pane shows the variables and functions that you have created and are available for repeated use. By default, it shows the variables in the global environment, that is, the user workspace in which you are working.

Each time you create a new object (a variable or function), a new entry will appear in the Environment pane. The entry shows the variable name and a short description of its value. When you change the value of a symbol or remove that symbol, you modify the environment, and the Environment pane updates to reflect your change.
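The same information can be inspected from code. The following is a small sketch, assuming it is run at the console so that the objects live in the global environment; the names `price` and `add_tax` are made up for illustration.

```r
# Create two objects: a variable and a function
price <- 100
add_tax <- function(x, rate = 0.2) x * (1 + rate)

# List the names of the objects in the current environment
ls()

# Changing a value updates the existing entry
price <- 120

# Removing a symbol deletes its entry
rm(price)

# Check whether a symbol is still defined in the current environment
exists("price", inherits = FALSE)
exists("add_tax", inherits = FALSE)
```

After `rm(price)`, the entry for `price` should disappear from the Environment pane, while `add_tax` remains listed.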

The History pane

The History pane shows the expressions previously evaluated in the console. You can repeat an earlier task by pressing the Up arrow key in the console until the expression you want appears, and then pressing Enter.

The history may be stored in the .Rhistory file in the working directory.

The File pane

The File pane shows the files in the current folder. You can navigate between folders, create new folders, delete or rename folders or files, and so on.

If you are working on an RStudio project, the File pane is handy for viewing and organizing project files.
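The same file operations are available from R code, which is handy in scripts. The sketch below works in a temporary folder so that no real files are touched; the folder and file names are hypothetical.

```r
# Create a uniquely named temporary folder to work in
project_dir <- tempfile("demo-project-")
dir.create(project_dir)

# Create an empty file, then rename it
old_path <- file.path(project_dir, "draft.R")
new_path <- file.path(project_dir, "analysis.R")
file.create(old_path)
file.rename(old_path, new_path)

# List the files in the folder, much as the pane does
files <- list.files(project_dir)
files

# Delete the folder and everything in it
unlink(project_dir, recursive = TRUE)
```

`getwd()` returns the current working directory if you want to list your own files instead of a temporary folder.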

The Plots pane

The Plots pane is used to show graphics produced by R code. If you produce more than one plot, the previous ones are stored and you can navigate back and forth to view all plots (until you clear them).

When you resize the Plots pane, graphics are redrawn to fit the new size, so they look as good as they did before resizing. You can also export a plot to a file for future use.
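Exporting can also be done from code by drawing into a file-based graphics device instead of the pane. This is a minimal sketch using the built-in `cars` dataset and the base `pdf()` device; the file name `scatter.pdf` and the chosen size are arbitrary.

```r
# Choose an output file in the temporary directory
plot_path <- file.path(tempdir(), "scatter.pdf")

# Open a PDF graphics device; width and height are in inches
pdf(plot_path, width = 6, height = 4)

# Everything drawn now goes to the file rather than the Plots pane
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Close the device to finish writing the file
dev.off()

file.exists(plot_path)
```

The file is only complete once `dev.off()` has been called, so forgetting it is a common source of empty or corrupt output.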

The Packages pane

Much of R's power derives from its packages. The Packages pane shows all installed packages. You can also easily install or update packages from CRAN or remove an existing package from your library.
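The pane is a front end to functions you can also call yourself. The sketch below lists installed packages; the calls that would change your library are left commented out, and `formattable` is used there purely as an example package name.

```r
# Names of all packages installed in your libraries
installed <- rownames(installed.packages())
head(installed)

# Check whether a particular package is installed
"stats" %in% installed

# These calls modify your library, so they are commented out here:
# install.packages("formattable")   # install a package from CRAN
# update.packages()                 # update installed packages
# remove.packages("formattable")    # remove a package from the library
```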

The Help pane

A lot of R's power also derives from its detailed documentation. The Help pane shows the documentation so that you can easily learn how to use functions.

There are several ways to view a function's documentation:

  • Type the function name in the Search box and find it directly

  • Type the function name in the console and press F1

  • Type ? before the function name and execute it

In practice, you don't have to remember all of R's functions; you only need to remember how to get help with a function you are not familiar with.
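A short sketch of getting help from code follows. The help calls are wrapped in `interactive()` only so that the snippet also runs cleanly as a script; at the console you would simply type them directly.

```r
# Find functions whose names contain "mean" when you only
# remember part of a name
matches <- apropos("mean")
head(matches)

# Show a function's arguments without opening its help page
args(sd)

if (interactive()) {
  # Two equivalent ways to open the documentation for mean()
  ?mean
  help("mean")

  # Search the documentation when you don't know the function name
  help.search("standard deviation")
}
```

In RStudio, the pages opened by `?` and `help()` are displayed in the Help pane.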

The Viewer pane

The Viewer pane is a new feature; it was introduced as an increasing number of R packages combine the functionality of both R and existing JavaScript libraries to make rich and interactive presentations of data.

The following screenshot is an example from my formattable (http://renkun.me/formattable) package, which provides a simple implementation of Excel-style conditional formatting for data frames in R:
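To give a rough idea of how such output is produced, here is a small sketch that assumes the formattable package is installed; it is skipped otherwise. The data frame `scores` and the color choices are made up for illustration.

```r
# A small made-up data frame
scores <- data.frame(
  name  = c("Ann", "Bob", "Cara"),
  score = c(62, 85, 91)
)

if (requireNamespace("formattable", quietly = TRUE)) {
  # Shade each cell in the score column on a gradient from
  # white (lowest value) to light green (highest value)
  formattable::formattable(
    scores,
    list(score = formattable::color_tile("white", "lightgreen"))
  )
}
```

When run interactively in RStudio, the formatted table should be rendered as HTML in the Viewer pane rather than printed as plain text in the console.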

RStudio Server

If you are using a supported version of Linux, you can easily set up a server version of RStudio, or RStudio Server. It runs on a host server (probably much more powerful and stable than your laptop) and you can run an R session in RStudio in your web browser. The user interface is mostly the same but you have access to the computing and memory resources of the server, as if you were using a local computer.