R Object-oriented Programming

By: Kelly Black

Preface

The R environment is a powerful software suite that began as an implementation modeled on the S language, which was originally developed at Bell Laboratories. The original code base was created by Ross Ihaka and Robert Gentleman in 1993. It grew rapidly with the help of others, and it has since become a standard in statistical computing. The software suite itself has grown well beyond an implementation of a language and has become an "environment". It is extensible, and the wide variety of available packages helps make it a resource that continues to grow in popularity and power.

Our aim in this book is to provide a resource for programming using the R language, and we assume that you will be making use of the R environment to implement and test your code. The book can be roughly divided into four parts. In the first part, we provide a discussion of the basic ideas and topics necessary to understand how R classifies data and the options that can be used to make calculations from data. In the second part, we provide a discussion of how R organizes data and the options available to keep track of data, display data, and read and save data. In the third part, we provide a discussion on programming topics specific to the R language and the options available for object-oriented programming. In the fourth part, we provide several extended examples as a way to demonstrate how all of the topics can fit together to solve problems.

What this book covers

A list of the chapters is given here. The first three chapters focus on getting data into the system and on the most basic calculations that can be performed with data. The next three chapters focus on miscellaneous issues that arise in practice when working with and examining data, including the mechanics of dealing with different data types. The next three chapters focus on basic and advanced programming topics. The final three chapters provide more detailed examples to demonstrate how all of the ideas can be brought together to solve problems.

Chapter 1, Data Types, offers a broad overview of the different data types. This includes basic representations such as float, double, complex, factors, and integer representations, and it also includes examples of how to enter vectors through the interactive shell. A brief discussion of the most basic operations and how to interact with the R shell is also given.
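As a minimal sketch of the kind of material this chapter covers, the following lines create a few basic values and a vector and then ask R how it classifies them. The variable names are made up for illustration and are not taken from the chapter itself. Note that R stores ordinary real numbers as doubles.

```r
# Hypothetical values, one for each basic representation.
x_double  <- 3.5                            # real number, stored as a double
x_integer <- 2L                             # the L suffix requests an integer
x_complex <- 1 + 2i                         # complex number
x_factor  <- factor(c("low", "high", "low"))  # categorical data

# A numeric vector entered with the c command.
v <- c(1, 3, 5, 7, -10)

class(x_double)    # "numeric"
typeof(x_double)   # "double"
class(x_integer)   # "integer"
class(x_complex)   # "complex"
class(x_factor)    # "factor"
length(v)          # 5
```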

Chapter 2, Organizing Data, offers a more detailed look at the way data is organized within the R environment. Additional topics include how to access the data as well as how to perform basic operations on the various data structures. The primary data structures examined are lists, arrays, tables, and data frames.
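The four structures named above can be sketched in a few lines. This is only an illustration with invented data and names; the chapter itself may use different examples.

```r
# A list can hold elements of different types and lengths.
record <- list(id = 7, tags = c("a", "b"))

# An array is a vector with a dimension attribute (here 2 x 3 x 4).
grid <- array(1:24, dim = c(2, 3, 4))

# A data frame holds columns of equal length, possibly of different types.
grades <- data.frame(student = c("Ann", "Ben", "Cal"),
                     score   = c(88, 92, 75),
                     stringsAsFactors = FALSE)

# A table counts how often each value occurs.
counts <- table(c("pass", "fail", "pass", "pass"))

# Accessing the data in each structure.
second_tag   <- record$tags[2]
last_cell    <- grid[2, 3, 4]
high_scorers <- grades[grades$score > 80, "student"]
n_pass       <- counts[["pass"]]
```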

Chapter 3, Saving Data and Printing Results, offers a detailed look at the ways to bring data into the R environment and builds on the topics discussed in the previous chapter. Additional topics revolve around the ways to display results as well as various ways to save data.

Chapter 4, Calculating Probabilities and Random Numbers, offers a detailed examination of the probability and sampling features of the R language. The R environment includes a number of features to aid in the way data can be analyzed. Any statistical analysis includes an underlying reliance on probability, and it is a topic that cannot be ignored. The availability of a wide variety of probability and sampling options is one of the strengths of the R language, and we explore some of the options in this chapter.
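A short sketch of the probability and sampling commands follows. It assumes only functions from the base R distribution (pnorm, rnorm, sample, and set.seed); the variable names and numbers are chosen purely for illustration.

```r
# Probability that a standard normal value lies within one
# standard deviation of the mean.
p_within_one_sd <- pnorm(1) - pnorm(-1)

# Fix the seed so that the random draws can be reproduced.
set.seed(42)

# One thousand draws from a normal distribution with mean 10 and sd 3.
draws <- rnorm(1000, mean = 10, sd = 3)

# Five distinct values sampled from the integers 1 through 20.
picked <- sample(1:20, size = 5, replace = FALSE)
```

Setting the seed is optional, but it makes a simulation repeatable, which helps when testing code.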

Chapter 5, Character and String Operations, offers a detailed examination of the various options available for examining, testing, and performing operations on strings. This is an important topic because it is not uncommon for datasets to have inconsistencies, and a routine that reads data from a file should include some basic checks.
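The kind of basic check described here might look like the following sketch. The input values are invented, and the example assumes a version of R recent enough to include trimws (R 3.2.0 or later).

```r
# Hypothetical names as they might arrive from a file.
raw <- c(" Alice ", "bob", "CAROL", "d4ve")

# Remove leading and trailing whitespace.
cleaned <- trimws(raw)

# Flag entries that consist only of letters.
valid <- grepl("^[A-Za-z]+$", cleaned)

# Normalize capitalization: first letter upper case, the rest lower case.
normalized <- paste0(toupper(substring(cleaned, 1, 1)),
                     tolower(substring(cleaned, 2)))
```

Here the last entry is flagged as invalid because it contains a digit, so a routine reading the file could report it rather than silently accept it.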

Chapter 6, Converting and Defining Time Variables, offers a detailed examination of the time data structure. A basic introduction is given in the first chapter, and more details are provided in this chapter. The prevalence of time-related data makes the topic of these data structures too important to ignore.
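As a small illustration of time variables, the sketch below converts two strings to date-time values, finds the time between them, and prints one in a different format. The dates are arbitrary, and the time zone is set explicitly so the result does not depend on the local settings.

```r
# Convert character strings to date-time values.
start  <- as.POSIXct("2014-03-01 08:30:00", tz = "UTC")
finish <- as.POSIXct("2014-03-02 10:00:00", tz = "UTC")

# Elapsed time between the two, expressed in hours.
elapsed <- difftime(finish, start, units = "hours")
hours   <- as.numeric(elapsed)

# Display the starting date in a different format.
start_label <- format(start, "%Y/%m/%d")
```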

Chapter 7, Basic Programming, offers a detailed examination of the most basic flow controls and programming features of the R language. The chapter provides details about conditional execution as well as the various looping constructs. Additionally, mundane topics associated with writing programs, execution, and formatting are also discussed.

Chapter 8, S3 Classes, offers a detailed examination of S3 classes. This is the first and most common approach to object-oriented programming. The use of S3 classes can be confusing to people already familiar with object-oriented programming, but their flexibility has made them a popular way to approach object-oriented programming in R.
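To give a flavor of why S3 classes can surprise people coming from other languages, here is a minimal sketch. There is no formal class definition: an object is an ordinary list with a class attribute, and methods are plain functions found by name. The names newStudent and average are hypothetical and are not taken from the chapter.

```r
# Hypothetical constructor: build a list and label it with a class name.
newStudent <- function(name, scores) {
    obj <- list(name = name, scores = scores)
    class(obj) <- "student"
    obj
}

# A generic function; UseMethod dispatches on the class of the first argument.
average <- function(x, ...) UseMethod("average")

# The method for the student class is found through its name, average.student.
average.student <- function(x, ...) mean(x$scores)

# Fallback used when no method matches the class of the object.
average.default <- function(x, ...) stop("average() is not defined for this class")

s <- newStudent("Ada", c(90, 80, 100))
result <- average(s)   # 90
```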

Chapter 9, S4 Classes, offers a detailed examination of S4 classes. This is a more recent approach to object-oriented programming compared to S3 classes. It is a more structured approach and is more familiar to people who have experience with object-oriented programming.
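By way of contrast with S3, the sketch below declares an S4 class with typed slots and a validity check, then attaches a method to a generic. The class name Course and the generic describe are invented for this illustration; only the functions from the methods package (setClass, setGeneric, setMethod, and new) are standard.

```r
library(methods)

# Declare the class, its slots, and a validity check.
setClass("Course",
         representation(title = "character", credits = "numeric"),
         validity = function(object) {
             if (length(object@credits) != 1 || object@credits <= 0)
                 "credits must be a single positive number"
             else
                 TRUE
         })

# Declare a generic function and a method for the Course class.
setGeneric("describe", function(object, ...) standardGeneric("describe"))

setMethod("describe", "Course", function(object, ...) {
    paste0(object@title, " (", object@credits, " credits)")
})

# The validity check runs when the object is created.
course <- new("Course", title = "Statistics", credits = 3)
label  <- describe(course)   # "Statistics (3 credits)"
```

An attempt to create a Course with a negative number of credits stops with an error, which is the kind of structure that S3 classes do not enforce.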

Chapter 10, Case Study – Course Grades, offers an in-depth example of a grade-tracking application. This is the first and simplest of the three examples. It was chosen because it is likely to be familiar to a wide range of people.

Chapter 11, Case Study – Simulation, offers an in-depth example of an application that generates data based on Monte Carlo simulations. The application demonstrates how an object-oriented approach can be used to create an environment that executes simulations, organizes the results, and performs a basic analysis of the results.

Chapter 12, Case Study – Regression, offers an in-depth example of an application that provides a wide range of options for performing regression. Regression is a common task and occurs in a wide variety of contexts. The application handles both continuous and ordinal data as a way to demonstrate the use of a flexible object-oriented approach. You can download this chapter from https://www.packtpub.com/sites/default/files/downloads/6682OS_Case_Study_Regression.pdf.

Appendix, Package Management, gives a brief overview of installing, updating, and removing packages. Packages are libraries that can be added to R to extend its capabilities. Being able to extend R and make use of other libraries is one of R's more powerful features.

What you need for this book

It is assumed that you will be working in the R environment, and the example code has been developed and tested for R version 3.0.1 and later. The R environment is a type of free software and is made available through the efforts and generosity of the R Foundation. It can be downloaded from http://www.r-project.org/. The material in the first half of the book assumes that you have access to R and can work from the interactive command line within the R environment. The material in the second half of the book assumes that you are familiar with programming and can write and save computer code. At a minimum, you should have access to a programming editor and should be familiar with directory structures and search paths.

Who this book is for

If you are familiar with programming and wish to gain a basic understanding of the R environment and learn how to create programming applications using the R language, this is the book for you. It is assumed that you have some exposure to the R environment and have a basic understanding of R. This book does not provide extensive motivation for certain approaches and practices, because it assumes that the reader is comfortable with developing software applications.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "A list is created using the list command, and a variable can be tested or coerced using the is.list and as.list commands."

A block of code is set as follows:

> x = rnorm(5,mean=10,sd=3)
> x
[1] 11.172719  8.784284 10.074035  5.735171 10.800138
> pnorm(abs(x-10),mean=0,sd=3)-pnorm(-abs(x-10),mean=0,sd=3)
[1] 0.30413363 0.31469803 0.01968849 0.84486037 0.21030971
>

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

> v <- c(1,3,5,7,-10)
> v
[1]   1   3   5   7 -10
> v[4]
[1] 7
> v[2] <- v[1]-v[5]
> v
[1]   1  11   5   7 -10

New terms and important words are shown in bold.

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to , and mention the book title via the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. An additional source for the examples in this book can be found at https://github.com/KellyBlack/R-Object-Oriented-Programming. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Copyright violations

Violation of copyright laws for material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at if you are having a problem with any aspect of the book, and we will do our best to address it.