R for Data Science Cookbook (n)

By : Yu-Wei, Chiu (David Chiu)

Overview of this book

This cookbook offers a range of data analysis samples in simple and straightforward R code, providing step-by-step resources and time-saving methods to help you solve data problems efficiently. The first section deals with how to create R functions to avoid the unnecessary duplication of code. You will learn how to prepare, process, and perform sophisticated ETL for heterogeneous data sources with R packages. An example of data manipulation is provided, illustrating how to use the “dplyr” and “data.table” packages to efficiently process larger data structures. We also focus on “ggplot2” and show you how to create advanced figures for data exploration. In addition, you will learn how to build an interactive report using the “ggvis” package. Later chapters offer insight into time series analysis on financial data, while there is detailed information on the hot topic of machine learning, including data classification, regression, clustering, association rule mining, and dimension reduction. By the end of this book, you will understand how to resolve issues and will be able to comfortably offer solutions to problems encountered while performing data analysis.

Handling errors in a function


If you are familiar with modern programming languages, you have probably used try, catch, and finally blocks to handle errors that may occur at runtime. R provides similar error-handling operations through functions. You can therefore add error-handling mechanisms to your R code to make programs more robust. In this recipe, we will introduce some basic error-handling functions in R.

Getting ready

Ensure that you have completed the previous recipes and have R installed on your operating system.

How to do it...

Perform the following steps to handle errors in an R function:

  1. First, let's observe what an error message looks like:

    > 'hello world' + 3
    Error in "hello world" + 3 : non-numeric argument to binary operator
    
  2. In a user-defined function, we can raise an error with our own message by calling stop when something unexpected happens:

    > addnum <- function(a, b) {
    +   if (!is.numeric(a) || !is.numeric(b)) {
    +     stop("Either a or b is not numeric")
    +   }
    +   a + b
    + }
    > addnum(2, 3)
    [1] 5
    > addnum("hello world", 3)
    Error in addnum("hello world", 3) : Either a or b is not numeric
    
  3. Now, let's see what happens if we replace the stop function with a warning function:

    > addnum2 <- function(a, b) {
    +   if (!is.numeric(a) || !is.numeric(b)) {
    +     warning("Either a or b is not numeric")
    +   }
    +   a + b
    + }
    > addnum2("hello world", 3)
    Error in a + b : non-numeric argument to binary operator
    In addition: Warning message:
    In addnum2("hello world", 3) : Either a or b is not numeric
    
  4. We can also convert warnings into errors by setting the warn option to 2:

    > options(warn = 2)
    > addnum2("hello world", 3)
    Error in addnum2("hello world", 3) :
      (converted from warning) Either a or b is not numeric
    
  5. To suppress warnings, we can wrap the function call in suppressWarnings:

    > suppressWarnings(addnum2("hello world", 3))
    Error in a + b : non-numeric argument to binary operator
    
  6. We can also use the try function to catch the error message:

    > errormsg <- try(addnum("hello world", 3))
    Error in addnum("hello world", 3) : Either a or b is not numeric
    > errormsg
    [1] "Error in addnum(\"hello world\", 3) : Either a or b is not numeric\n"
    attr(,"class")
    [1] "try-error"
    attr(,"condition")
    <simpleError in addnum("hello world", 3): Either a or b is not numeric>
    
  7. By setting the silent argument to TRUE, we can suppress the error message displayed on the console:

    > errormsg <- try(addnum("hello world", 3), silent = TRUE)
    
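  8. Furthermore, we can use the try function to keep an error from interrupting a for-loop. Here, we first show a for-loop that does not use try. Because options(warn=2) from step 4 is still in effect, the coercion warning raised by as.integer('O') is converted into an error: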

    > iter <- c(1, 2, 3, 'O', 5)
    > res <- rep(NA, length(iter))
    > for (i in 1:length(iter)) {
    +   res[i] <- as.integer(iter[i])
    + }
    Error: (converted from warning) NAs introduced by coercion
    > res
    [1] 1 2 3 NA NA
    
  9. Now, let's see what happens if we insert the try function into the code:

    > iter <- c(1, 2, 3, 'O', 5)
    > res <- rep(NA, length(iter))
    > for (i in 1:length(iter)) {
    +   res[i] <- try(as.integer(iter[i]), silent = TRUE)
    + }
    > res
    [1] "1"
    [2] "2"
    [3] "3"
    [4] "Error in try(as.integer(iter[i]), silent = TRUE) : \n (converted from warning) NAs introduced by coercion\n"
    [5] "5"
    
  10. To validate arguments, we can use the stopifnot function, which raises an error if any of its conditions is not TRUE:

    > addnum3 <- function(a, b) {
    +   stopifnot(is.numeric(a), is.numeric(b))
    +   a + b
    + }
    > addnum3("hello", "world")
    Error: is.numeric(a) is not TRUE
    
  11. To handle all kinds of errors, we can use the tryCatch function for error handling:

    > dividenum <- function(a, b) {
    +   result <- tryCatch({
    +     print(a / b)
    +   }, error = function(e) {
    +     if (!is.numeric(a) || !is.numeric(b)) {
    +       print("Either a or b is not numeric")
    +     }
    +   }, finally = {
    +     rm(a)
    +     rm(b)
    +     print("clean variable")
    +   })
    + }
    > dividenum(2, 4)
    [1] 0.5
    [1] "clean variable"
    > dividenum("hello", "world")
    [1] "Either a or b is not numeric"
    [1] "clean variable"
    > dividenum(1)
    Error in value[[3L]](cond) : argument "b" is missing, with no default
    [1] "clean variable"
    

How it works...

Similar to other programming languages, R provides developers with an error-handling mechanism. However, error handling in R is implemented with functions such as try and tryCatch rather than with dedicated code blocks, because every operation in R is a function call.

In the first step, we demonstrate what happens when we add a number to a character string. Because the operation is invalid, R prints an error message on the console. The basic condition types in R include error, warning, and interrupt.
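
As a minimal sketch of how these types appear in practice, the following code captures an error and a warning as objects instead of letting R print them. The variable names err and wrn are illustrative only, and the example uses tryCatch and conditionMessage, which are covered later in this recipe.

```r
# Capture the condition objects instead of printing them on the console.
err <- tryCatch("hello world" + 3, error = function(e) e)
wrn <- tryCatch(as.integer("O"), warning = function(w) w)

class(err)             # includes "error" and "condition"
class(wrn)             # includes "warning" and "condition"
conditionMessage(err)  # the text shown after "Error in ... :"
```

Both objects inherit from the condition class, which is why a single handler mechanism can deal with errors and warnings alike.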

Next, we create a function named addnum, which returns the sum of its two arguments. However, a caller may sometimes pass an unexpected type of input (for example, a character string) into the function. To handle this case, we add an argument type check before the addition. If either input is not numeric, the stop function halts execution and raises an error with the message passed to it.

Besides the stop function, we can use the warning function to report a problem. However, warning does not terminate the function, so execution proceeds to a + b. Thus, both an error and a warning message are displayed on the console. Setting warn=2 in the options function converts every warning into an error, so the function stops at the warning call. To mute the warning message instead, we can wrap the call in suppressWarnings; the error raised by a + b is still reported. On the other hand, we can also use the stopifnot function to check whether the arguments are valid. If any of its conditions is not TRUE, it stops the program and prints an error message on the screen.
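
The difference between muting a warning and promoting it to an error can be sketched as follows. The helper addnum_warn is a hypothetical variant of addnum2 that returns NA after warning, so that the warning is the only condition raised; it is not part of the recipe.

```r
# Hypothetical helper: warns on bad input and returns NA instead of failing.
addnum_warn <- function(a, b) {
  if (!is.numeric(a) || !is.numeric(b)) {
    warning("Either a or b is not numeric")
    return(NA)
  }
  a + b
}

# suppressWarnings() mutes the warning; the function still returns its value.
quiet <- suppressWarnings(addnum_warn("hello world", 3))

# With warn = 2, the same warning is promoted to an error.
old <- options(warn = 2)
promoted <- try(addnum_warn("hello world", 3), silent = TRUE)
options(old)  # restore the previous warn setting

is.na(quiet)
inherits(promoted, "try-error")
```

Saving the value returned by options and restoring it afterwards keeps the warn=2 setting from leaking into later code, which is what happens to the for-loop in step 8.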

Moving on, we can catch the error using the try function. Here, we call addnum with a character string and a number and store the resulting error object in errormsg. By default, try still prints the error message on the screen; we can mute it by setting the silent argument to TRUE. Furthermore, the try function is very helpful if we don't want a for-loop to be interrupted by unexpected errors. We first demonstrate how an error may interrupt the loop: execution stops at the fourth element, and only the first three values are assigned to res. By wrapping the code in a try call, the loop runs to completion, and the error message is stored in place of the failed element.
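
Storing the error text in res turns the whole vector into character strings. One possible alternative, sketched here under the assumption that a failed element should simply stay NA, is to test the value returned by try with inherits before assigning it. The variable value is introduced only for this illustration.

```r
iter <- c(1, 2, 3, "O", 5)
res <- rep(NA_integer_, length(iter))

# Assumption: warnings are promoted to errors, as in the recipe.
old <- options(warn = 2)
for (i in seq_along(iter)) {
  value <- try(as.integer(iter[i]), silent = TRUE)
  # Assign only when try() did not return a "try-error" object.
  if (!inherits(value, "try-error")) {
    res[i] <- value
  }
}
options(old)  # restore the previous warn setting

res
```

With this check, res remains an integer vector and the fourth element keeps its initial NA.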

Besides the try function, we can use a more advanced error-handling function, tryCatch, which handles conditions such as warnings and errors. We use the tryCatch function in the following manner:

    tryCatch({
      result <- expr
    }, warning = function(w) {
      # handle the warning
    }, error = function(e) {
      # handle the error
    }, finally = {
      # clean up
    })
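
To make the template concrete, here is a small runnable sketch. The function describe_log is hypothetical and exists only to show which branch of tryCatch handles each call; log is used because it signals a warning for negative numbers and an error for character input.

```r
# Hypothetical helper that reports which tryCatch branch handled the call.
describe_log <- function(x) {
  tryCatch({
    paste("value:", log(x))
  }, warning = function(w) {
    "warning branch"    # log(-1) signals a warning
  }, error = function(e) {
    "error branch"      # log("a") signals an error
  }, finally = {
    message("cleanup")  # runs after every call
  })
}

describe_log(1)    # "value: 0"
describe_log(-1)
describe_log("a")
```

The value of the handler that runs becomes the value of the whole tryCatch call, while the finally expression is evaluated only for its side effects.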

In this form, we handle warnings and errors in separate handler functions. Following this form, we create a function named dividenum. The function first performs numeric division; if an error occurs, the error handler catches it and prints a message. In the finally block, we remove the variables defined within the function and print clean variable. We then test the function in three situations: performing a normal division, dividing a string by a string, and passing only one argument to the function. In the first case, the function prints the division result, followed by clean variable, because that message is printed in the finally block. In the second case, the error handler catches the non-numeric argument error and prints its message, and clean variable is printed at the end. In the last case, the error handler is invoked for the missing b argument, but the handler itself fails when it evaluates is.numeric(b), so the error is not handled; R prints the error message and then runs the finally block, which prints clean variable.
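
The third case fails because the error handler reads b again. A possible variant, shown here only as a sketch, keeps the handler from touching a or b so that a missing argument is handled like any other error. The name dividenum_safe and the choice to return NA on failure are assumptions made for this illustration.

```r
# Variant of dividenum whose error handler never evaluates a or b,
# so it cannot fail on a missing argument.
dividenum_safe <- function(a, b) {
  tryCatch({
    a / b
  }, error = function(e) {
    message("Division failed: ", conditionMessage(e))
    NA  # assumption: report failure as NA
  }, finally = {
    message("clean variable")
  })
}

dividenum_safe(2, 4)
dividenum_safe("hello", "world")
dividenum_safe(1)
```

All three calls now return normally, and the finally block still runs in each case.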

There's more...

If you want to capture the error message while using the tryCatch function, you can call conditionMessage on the condition object inside the error handler:

    > dividenum <- function(a, b) {
    +   result <- tryCatch({
    +     a / b
    +   }, error = function(e) {
    +     conditionMessage(e)
    +   })
    +   result
    + }
    > dividenum(3, 5)
    [1] 0.6
    > dividenum(3, "hello")
    [1] "non-numeric argument to binary operator"

In this example, if you pass two valid numeric arguments to the dividenum function, the function returns the result of 3/5. On the other hand, if you pass a non-numeric value to the function, the error handler extracts the error text with conditionMessage, and the function returns that text as its output.
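
Returning the message as an ordinary value can make a failure hard to tell apart from a valid result. One alternative, sketched below, is to raise a new error that adds context to the original message. The function name dividenum_verbose and the wording of the prefix are hypothetical.

```r
# Hypothetical variant: re-raise the error with extra context
# instead of returning the message as a value.
dividenum_verbose <- function(a, b) {
  tryCatch(a / b, error = function(e) {
    stop("dividenum_verbose failed: ", conditionMessage(e), call. = FALSE)
  })
}

# The caller can still catch the new error and inspect its message.
msg <- tryCatch(dividenum_verbose(3, "hello"),
                error = function(e) conditionMessage(e))
msg
```

Here, call. = FALSE omits the handler's own call from the message, so the reported text starts with the custom prefix.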