Learning R Programming

By: Kun Ren

Overview of this book

R is a high-level functional language and one of the must-know tools for data science and statistics. Powerful but complex, R can be challenging for beginners and those unfamiliar with its unique behaviors. Learning R Programming offers an easy and practical way to learn R and develop a broad and consistent understanding of the language. Through hands-on examples, you'll discover powerful R tools and R best practices that will give you a deeper understanding of working with data. You'll get to grips with R's data structures and data processing techniques, as well as the most popular R packages to boost your productivity from the outset. Start with the basics of R, then dive deep into the programming techniques and paradigms to make your R code excel. Advance quickly to a deeper understanding of R's behavior as you learn common tasks including data analysis, databases, web scraping, high performance computing, and writing documents. By the end of the book, you'll be a confident R programmer adept at solving problems with the right techniques.

A quick example


In this section, I will demonstrate a simple example of computing, model fitting, and producing graphics by typing in commands in the console.

First, let's create a vector x of 100 normally distributed random numbers. Then, create another vector y of 100 numbers, each of which is 3 times the corresponding element of x, plus 2, plus some random noise. Note that <- is the assignment operator, which we will cover later. I use str() to print the structure of the vectors:

x <- rnorm(100) 
y <- 2 + 3 * x + rnorm(100) * 0.5 
str(x) 
##  num [1:100] -0.4458 -1.2059 0.0411 0.6394 -0.7866 ... 
str(y) 
##  num [1:100] -0.022 -1.536 2.067 4.348 -0.295 ... 
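
Because rnorm() draws fresh random numbers on every call, the values you get will differ from the output shown above. If you want repeatable results, one option is to call set.seed() before generating the data. The following is a minimal sketch; the seed value 123 is an arbitrary choice for illustration and is not the one used to produce the output in this section:

```r
# Fix the state of the random number generator so results are repeatable.
# The seed 123 is an arbitrary, illustrative choice.
set.seed(123)
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100) * 0.5

# Re-seeding with the same value reproduces exactly the same x
set.seed(123)
x_again <- rnorm(100)
identical(x, x_again)
## [1] TRUE
```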

Since we know that the true relationship between X and Y is $Y = 2 + 3X + \varepsilon$, where $\varepsilon$ is random noise, we can run a simple linear regression on the sample X and Y and see how well the linear model recovers the linear parameters (that is, 2 and 3) of the model. We call lm(y ~ x) to fit such a model:

model1 <- lm(y ~ x) 

The result of the model fitting is stored in an object named model1. We can view the model fit by simply typing model1 or explicitly typing print(model1):

model1 
##  
## Call: 
## lm(formula = y ~ x) 
##  
## Coefficients: 
## (Intercept)            x   
##       2.051        2.973 
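
To work with the estimates as numbers rather than as printed text, one option is coef(), which returns the coefficients as a named numeric vector. The sketch below regenerates the data with an arbitrary seed (an assumption made only for repeatability), so its estimates will not match the output above exactly:

```r
set.seed(123)  # arbitrary seed, assumed here only for repeatability
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100) * 0.5
model1 <- lm(y ~ x)

# Named numeric vector with elements "(Intercept)" and "x"
est <- coef(model1)

# Difference between the estimates and the true parameters (2 and 3)
est - c(2, 3)
```

Both differences should be small relative to the true values, since the noise added to y is modest.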

If you want to see more details, call summary() with model1:

summary(model1) 
##  
## Call: 
## lm(formula = y ~ x) 
##  
## Residuals: 
##      Min       1Q   Median       3Q      Max  
## -1.14529 -0.30477  0.03154  0.30042  0.98045  
##  
## Coefficients: 
##             Estimate Std. Error t value Pr(>|t|)     
## (Intercept)  2.05065    0.04533   45.24   <2e-16 *** 
## x            2.97343    0.04525   65.71   <2e-16 *** 
## --- 
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
##  
## Residual standard error: 0.4532 on 98 degrees of freedom 
## Multiple R-squared:  0.9778, Adjusted R-squared:  0.9776 
## F-statistic:  4318 on 1 and 98 DF,  p-value: < 2.2e-16 
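
The object returned by summary() can also be inspected programmatically instead of being read off the screen. A brief sketch, again using an arbitrary seed as an assumption: the coefficient table is available as a matrix, R-squared as a single number, and confint() computes confidence intervals for the parameters.

```r
set.seed(123)  # arbitrary seed, assumed here only for repeatability
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100) * 0.5
model1 <- lm(y ~ x)

s <- summary(model1)
coef_table <- s$coefficients         # matrix: Estimate, Std. Error, t value, Pr(>|t|)
r2 <- s$r.squared                    # multiple R-squared
ci <- confint(model1, level = 0.95)  # 95% confidence intervals

coef_table["x", "Estimate"]          # estimated slope
r2
ci
```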

We can plot the points and the fitted model together:

plot(x, y, main = "Simple linear regression") 
abline(model1$coefficients, col = "blue") 
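
As a possible extension, you can overlay the true line used to generate the data for a visual comparison with the fitted one. This sketch assumes an arbitrary seed, and the dashed red line and the legend are illustrative additions rather than part of the original example:

```r
set.seed(123)  # arbitrary seed, assumed here only for repeatability
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100) * 0.5
model1 <- lm(y ~ x)

plot(x, y, main = "Simple linear regression")
abline(model1, col = "blue")                # fitted line; abline() also accepts an lm object
abline(a = 2, b = 3, col = "red", lty = 2)  # true line: intercept 2, slope 3
legend("topleft", legend = c("Fitted", "True"),
       col = c("blue", "red"), lty = c(1, 2))
```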

The resulting plot shows the sample points together with the fitted line in blue. This quick example demonstrates some simple functions so that you can get a first impression of working with R. If you are not familiar with the symbols and functions in the example, don't worry: the next few chapters will cover the basic objects and functions you need to know.