R for Data Science Cookbook (n)

By: Yu-Wei, Chiu (David Chiu)

Overview of this book

This cookbook offers a range of data analysis samples in simple and straightforward R code, providing step-by-step resources and time-saving methods to help you solve data problems efficiently. The first section deals with how to create R functions to avoid the unnecessary duplication of code. You will learn how to prepare, process, and perform sophisticated ETL for heterogeneous data sources with R packages. An example of data manipulation is provided, illustrating how to use the “dplyr” and “data.table” packages to efficiently process larger data structures. We also focus on “ggplot2” and show you how to create advanced figures for data exploration. In addition, you will learn how to build an interactive report using the “ggvis” package. Later chapters offer insight into time series analysis on financial data, while there is detailed information on the hot topic of machine learning, including data classification, regression, clustering, association rule mining, and dimension reduction. By the end of this book, you will understand how to resolve issues and will be able to comfortably offer solutions to problems encountered while performing data analysis.

The debugging function


Debugging is one of the most common tasks a programmer faces on a daily basis. The simplest debugging method is to insert a print statement at every location of interest; however, this method is rather inefficient. Here, we will illustrate how to use some R debugging tools to help accelerate the debugging process.

Getting ready

Make sure that you know how a function works and how to create a new function.

How to do it...

Perform the following steps to debug an R function:

  1. First, we create a debugfunc function with x and y as arguments, but we only return x:

    > debugfunc <- function(x, y){
    + x <- y + 2
    + x
    + }
    
  2. We then pass only 2 to debugfunc:

    >debugfunc(2)
    Error in debugfunc(2) : argument "y" is missing, with no default
    
  3. Next, we apply the debug function onto debugfunc:

    >debug(debugfunc)
    
  4. At this point, we pass 2 to debugfunc again:

    >debugfunc(2)
    debugging in: debugfunc(2)
    debug at #1: {
    x <- y + 2
    x
    }
    
  5. You can type help to list all possible commands:

    Browse[2]> help
    n next
    s step into
    f finish
    c or cont continue
    Q quit
    where show stack
    help show help
    <expr> evaluate expression
    
  6. Then, you can type n to move on to the next debugging step:

    Browse[2]> n
    debug at #2: x <- y + 2
    
  7. At this point, you can use objects or ls to list variables:

    Browse[2]> objects()
    [1] "x" "y"
    Browse[2]>ls()
    [1] "x" "y"
    
  8. At each step, you can type the variable name to obtain the current value:

    Browse[2]> x
    [1] 2
    Browse[2]> y
    Error: argument "y" is missing, with no default
    
  9. Finally, you can quit the debug mode by typing the Q command:

    Browse[2]> Q
    
  10. You can then remove the debugging flag from the function using the undebug function, so that later calls no longer enter debug mode:

    >undebug(debugfunc)
    
  11. Moving on, let's debug the function using the browser function:

    debugfunc2<- function(x, y){
    x <- 3
    browser()
    x <- y + 2
    x
    }
    
  12. The debugger will then step right into where the browser function is located:

    >debugfunc2(2)
    Called from: debugfunc2(2)
    Browse[1]> n
    debug at #4: x <- y + 2
    
  13. To move among the active frames, call recover during the browsing process:

    Browse[2]> recover()
    Enter a frame number, or 0 to exit
    1: debugfunc2(2)
    Selection: 1
    Browse[4]> Q
    
  14. Alternatively, you can use the trace function to insert debugging code into debugfunc2 at step 4:

    >trace(debugfunc2, quote(if(missing(y)){browser()}), at=4)
    [1] "debugfunc2"
    
  15. You can then follow the debugging process from step 4, and examine the inserted code:

    >debugfunc2(3)
    Called from: debugfunc2(3)
    Browse[1]> n
    debug at #4: {
    .doTrace(if (missing(y)) {
    browser()
    }, "step 4")
    x <- y + 2
    }
    Browse[2]> n
    debug: .doTrace(if (missing(y)) {
    browser()
    }, "step 4")
    Browse[2]> Q
    
  16. You can also track the usage of certain functions with the trace function:

    >debugfunc3<- function(x, y){
    + x <- 3
    + sum(x)
    + x <- y + 2
    + sum(x,y)
    + x
    + }
    >trace(sum)
    >debugfunc3(2,3)
    trace: sum(x)
    trace: sum(x, y)
    [1] 5
    
  17. You can also print the calling stack of a function with the traceback function:

    >lm(y~x)
    Error in eval(expr, envir, enclos) : object 'y' not found
    >traceback()
    7: eval(expr, envir, enclos)
    6: eval(predvars, data, env)
    5: model.frame.default(formula = y ~ x, drop.unused.levels = TRUE)
    4: stats::model.frame(formula = y ~ x, drop.unused.levels = TRUE)
    3: eval(expr, envir, enclos)
    2: eval(mf, parent.frame())
    1: lm(y ~ x)
    

How it works...

Since bugs are inevitable in any code, an R programmer has to be well prepared for them with a good debugging toolset. In this recipe, we showed you how to debug a function with the debug, browser, trace, and traceback functions.

In the first section, we explained how to debug a function by applying debug to an existing function. We first made a function named debugfunc, with two input arguments: x and y. Then, we applied the debug function to debugfunc, passing the function itself as the argument. This flags the function for debugging. From this point on, whenever we invoke debugfunc, the R console enters browser mode, with Browse as the prompt at the start of each line.

Browser mode enables us to step through the execution of the function one statement at a time. The commands that one can use while debugging are listed here:

Command                 Meaning
c or cont (continue)    This executes all the code of the current function
n (next)                This evaluates the next statement, stepping over function calls
s (step into)           This evaluates the next statement, stepping into function calls
objects or ls           This lists all current objects
help                    This lists all possible commands
where                   This prints the stack trace of active function calls
f (finish)              This finishes the execution of current function
Q (quit)                This terminates the debugging mode

In the steps of this recipe, we first use help to list all possible commands. Then, we type n to step to the next line. Next, we type objects and ls to list all current objects. At this point, we can type a variable name to find out the current value of that object. Finally, we type Q to quit debugging mode, and use undebug to unflag the function.
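As a small, non-interactive illustration of flagging and unflagging, the following sketch uses a hypothetical function named flag_demo (it is not part of the recipe) and the base isdebugged function to check the flag. The function is never called here, so no browser session opens.

```r
# flag_demo is a hypothetical stand-in for debugfunc
flag_demo <- function(x, y) {
  x <- y + 2
  x
}

debug(flag_demo)                    # flag the function for debugging
flagged <- isdebugged(flag_demo)    # TRUE while the flag is set

undebug(flag_demo)                  # remove the flag again
unflagged <- isdebugged(flag_demo)  # FALSE once the flag is removed

print(c(flagged = flagged, unflagged = unflagged))
```

Checking isdebugged before calling a function can help when it is unclear whether an earlier debug call is still in effect.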

Besides using the debug function, we can insert the browser function into the code to debug it. After we insert browser() into debugfunc2, every call to the function pauses at that point, and the debugger opens at the line that follows the browser call. Here, we can use any command from the previous command table. To move among frames or return to the top level of debugging mode, we can use the recover function. Additionally, we can use the trace function to insert debugging code into a function. Here, we specify debugfunc2 as the function to trace and set the tracer to check whether y is missing. If y is missing, the tracer executes the browser() function. We set the at argument to 4 so that the tracer code is inserted at step 4 of debugfunc2, just before x <- y + 2. Then, when we call debugfunc2 without y, execution reaches the tracer, which calls the browser function because the y argument is missing.
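To see which statement a given value of at refers to, one option is to list the body of the function as steps. The sketch below is only an illustration under assumptions: demo_fun is a hypothetical function, and the tracer calls message instead of browser() so that the example runs without an interactive session.

```r
library(methods)  # trace() with a tracer relies on the methods package

# demo_fun is a hypothetical function used only for this sketch
demo_fun <- function(x, y) {
  x <- 3
  x <- y + 2
  x
}

# The step numbers used by `at` index into the function body;
# step 1 is the opening brace, so `x <- y + 2` is step 3 here
steps <- as.list(body(demo_fun))
print(steps)

# Insert a tracer just before step 3; it is expected to emit a message
# when y is missing (message() stands in for browser())
trace(demo_fun,
      tracer = quote(if (missing(y)) message("y is missing at step 3")),
      at = 3, print = FALSE)

# The call still fails at `x <- y + 2`; tryCatch keeps the error message
result <- tryCatch(demo_fun(2), error = function(e) conditionMessage(e))
print(result)

untrace(demo_fun)  # restore the original function
```

Calling untrace afterwards matters because a traced function keeps its inserted code until it is explicitly restored.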

Finally, we introduce the traceback function, which prints the call stack of the most recent error. At this step, we pass the formula y ~ x to the lm linear model fitting function without defining x or y. Because neither variable exists, the function returns an error message in the console output. To understand the sequence of calls that led to the error, we can use the traceback function to print out the stack.
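The traceback function is mainly useful at the console after an error has occurred. In a script, a similar call stack can be captured at the moment of the error with withCallingHandlers and sys.calls. The following is a minimal sketch of that alternative technique, not the mechanism traceback itself uses; outer_fun and inner_fun are hypothetical functions.

```r
# Hypothetical functions: outer_fun calls inner_fun, which fails
inner_fun <- function(v) stop("something went wrong")
outer_fun <- function(v) inner_fun(v)

captured <- NULL
res <- tryCatch(
  withCallingHandlers(
    outer_fun(1),
    # The calling handler runs while the failing calls are still on the
    # stack, so sys.calls() can record them before the stack unwinds
    error = function(e) captured <<- as.list(sys.calls())
  ),
  error = function(e) conditionMessage(e)
)

# Keep the first line of each deparsed call for a compact listing
stack <- vapply(captured, function(cl) deparse(cl)[1], character(1))
print(stack)
print(res)
```

The captured stack also contains the handler machinery itself (for example, the tryCatch and withCallingHandlers calls), so the entries of interest are the ones that name outer_fun and inner_fun.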

There's more...

Besides using the command line, we can use RStudio to debug functions:

  1. First, select Toggle Breakpoint from the Debug drop-down menu:

    Figure 1: Toggle Breakpoint

  2. Then, set a breakpoint to the left of the line number:

    Figure 2: Set breakpoint

  3. Next, save the code file and click on Source to activate the debugging process:

    Figure 3: Activate debugging process

  4. Finally, when you invoke the function, the R console enters Browse mode:

    Figure 4: Browse the function

You can now use the command line or the Debug drop-down menu to debug the function:

Figure 5: Use debug functions