Learning R Programming

By: Kun Ren

Overview of this book

R is a high-level functional language and one of the must-know tools for data science and statistics. Powerful but complex, R can be challenging for beginners and those unfamiliar with its unique behaviors. Learning R Programming is the solution: an easy and practical way to learn R and develop a broad and consistent understanding of the language. Through hands-on examples you'll discover powerful R tools and best practices that will give you a deeper understanding of working with data. You'll get to grips with R's data structures and data processing techniques, as well as the most popular R packages to boost your productivity from the outset. Start with the basics of R, then dive deep into the programming techniques and paradigms to make your R code excel. Advance quickly to a deeper understanding of R's behavior as you learn common tasks including data analysis, databases, web scraping, high performance computing, and writing documents. By the end of the book, you'll be a confident R programmer adept at solving problems with the right techniques.

Preface

R is designed for statistical computing, data analysis, and visualization. In recent years, it has become one of the most popular languages for data science and statistics. Programming in R heavily involves data processing, and it can be challenging for those who are unfamiliar with the behaviors of the language.

As a dynamic language, R allows extremely flexible use of data structures, which are not as strict as those in compiled languages such as C++, Java, and C#. When I started using R to process and analyze data, I found R’s behavior quirky, unpredictable, and sometimes very inconsistent.
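To make this flexibility concrete, here is a minimal sketch of a few base R behaviors that often surprise newcomers. The variable names (`x`, `total`, `df`) are chosen only for illustration and do not come from the book's own examples.

```r
# Mixing types in c() silently coerces everything to a common type
x <- c(1, "a", TRUE)
class(x)
## [1] "character"

# Logical values are treated as 0/1 in arithmetic
total <- sum(c(TRUE, FALSE, TRUE))
total
## [1] 2

# Selecting a single column of a data frame drops it to a plain vector
# unless drop = FALSE is given
df <- data.frame(a = 1:3, b = 4:6)
class(df[, 1])
## [1] "integer"
class(df[, 1, drop = FALSE])
## [1] "data.frame"
```

None of these calls raises an error or a warning, which is part of why unexpected results can be hard to trace back to their cause.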

In those data analysis projects, most of the effort was not spent running models. Instead, data cleaning, wrangling, and visualization took up a major part of my time. In fact, the most time-consuming task was finding what was wrong with code that produced weird results or died with unexpected errors. Dealing with programming problems rather than field problems can be frustrating, especially when you have fought bugs for hours without a clue.

However, as I worked on more projects and gained more experience, I gradually learned more about the behavior of objects and functions, and found that R is much more beautiful and consistent than I had thought. That’s why I've written this book: to share my perspective on programming in R.

Through this book, you will develop a universal and consistent understanding of R as a programming language along with its vast set of tools. You will learn the best practices to boost your productivity, develop a deeper understanding of working with data, and become more confident about programming in R and solving problems with the right techniques.

What this book covers

Chapter 1, Quick Start, discusses a few basic facts about R, how to deploy an R environment, and how to code in RStudio.

Chapter 2, Basic Objects, introduces basic R objects and their behaviors.

Chapter 3, Managing Your Workspace, introduces the methods of managing the working directory, R environment, and the library of extension packages.

Chapter 4, Basic Expressions, covers the basic expressions of the R language: assignment, condition, and loop.

Chapter 5, Working with Basic Objects, discusses the basic functions each analyst should know in order to work with basic objects in R.

Chapter 6, Working with Strings, talks about R objects related to strings, and a number of string manipulation techniques.

Chapter 7, Working with Data, explains simple functions for reading and writing data, with practical examples on various topics using basic objects and functions.

Chapter 8, Inside R, discusses R’s evaluation model by explaining how lazy evaluation, environments, functions, and lexical scoping work.

Chapter 9, Metaprogramming, introduces the metaprogramming techniques to help understand language objects and nonstandard evaluation.

Chapter 10, Object-Oriented Programming, describes the object-oriented programming systems in R: S3, S4, RefClass, and the community-provided R6.

Chapter 11, Working with Databases, shows how R works with popular relational databases such as SQLite and MySQL, and non-relational databases such as MongoDB and Redis.

Chapter 12, Data Manipulation, introduces techniques for manipulating relational data using data.table and dplyr, and non-relational data using rlist.

Chapter 13, High Performance Computing, discusses performance issues in R and several methods to boost computing performance.

Chapter 14, Web Scraping, talks about the basic structure of web pages, CSS and XPath selectors, and how to use the rvest package to scrape data from simple web pages.

Chapter 15, Boosting Productivity, demonstrates how R Markdown and shiny apps, combined with interactive graphics, can boost productivity in the reporting and presentation of data analysis.
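As a small taste of the topics listed for Chapter 8, the following sketch illustrates lazy evaluation and lexical scoping in base R. It is not taken from the book; the function names `f` and `make_counter` are hypothetical and used only for illustration.

```r
# Lazy evaluation: an argument is evaluated only when it is actually used.
# Here y is never touched because x > 0, so stop() is never called.
f <- function(x, y) {
  if (x > 0) "positive" else y
}
res_lazy <- f(1, stop("never evaluated"))
res_lazy
## [1] "positive"

# Lexical scoping: the inner function finds `count` in the environment
# where it was defined, and <<- modifies it there.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}
counter <- make_counter()
counter()
## [1] 1
counter()
## [1] 2
```

Each call to `make_counter()` creates a fresh environment, so two counters created this way would keep independent counts.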

What you need for this book

To run the example code in this book, you will need to install R 3.3.0 or newer. RStudio is the recommended development environment.

For Chapter 11, Working with Databases, a working MongoDB server and a Redis instance are required to run the examples.

For Chapter 13, High Performance Computing, Rtools 3.3 is required to build Rcpp code on Windows, and a gcc toolchain is required on Linux or macOS.
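One possible way to check these requirements from within R is sketched below, using only base functions. The list of packages is an illustrative selection of names mentioned in the chapter overview, not a complete requirements list, and the toolchain check assumes the compiler is exposed on the PATH under the names shown (on some systems the compiler may be installed under a different name).

```r
# Check that the running R version meets the stated minimum (3.3.0)
r_ok <- getRversion() >= "3.3.0"
r_ok

# Look for build tools on the PATH; an empty string means "not found"
tools <- Sys.which(c("gcc", "g++", "make"))
tools

# Report which of a few packages named in the chapter overview are installed
# (illustrative selection only)
pkgs <- c("Rcpp", "data.table", "dplyr", "rvest")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed
```

Any package reported as `FALSE` can be installed with `install.packages()` before starting the corresponding chapter.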

Who this book is for

This book targets those who work on data-related projects and want to boost productivity but may not be familiar with the R programming language and related tools.

This book also targets professional data analysts who want to systematically learn the R programming language, related techniques, and recommended packages and practices.

Although several chapters are a bit advanced for beginners, you don't have to be a computer expert or a professional data analyst to read them. I do assume, however, that you have a general idea of basic programming concepts and some experience with data processing.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The apply function also supports array input and matrix output."

The style of inline code words (variables and function names) and code blocks is set as follows:

x <- c(1, 2, 3)
class(x)
## [1] "numeric"
typeof(x)
## [1] "double"
str(x)
##  num [1:3] 1 2 3

Certain areas of the code are highlighted whenever we want to draw your attention to a particular point:

x <- rnorm(100)
y <- 2 * x + rnorm(100) * 0.5
m <- lm(y ~ x)
coef(m)

New terms and important words are shown in bold. 

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

  1. Log in to or register on our website using your e-mail address and password.

  2. Hover the mouse pointer over the SUPPORT tab at the top.

  3. Click on Code Downloads & Errata.

  4. Enter the name of the book in the Search box.

  5. Select the book for which you're looking to download the code files.

  6. Choose from the drop-down menu where you purchased this book.

  7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows

  • Zipeg / iZip / UnRarX for Mac

  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/learningrprogramming. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at [email protected] with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.