Web Application Development with R Using Shiny - Second Edition

By: Chris Beeley

Overview of this book

R is a highly flexible and powerful tool for analyzing and visualizing data. Most of the applications built using various libraries with R are desktop-based. But what if you want to go on the web? Here comes Shiny to your rescue! Shiny allows you to create interactive web applications using the excellent analytical and graphical capabilities of R. This book will guide you through basic data management and analysis with R through your first Shiny application, and then show you how to integrate Shiny applications with your own web pages. Finally, you will learn how to finely control the inputs and outputs of your application, along with using other packages to build state-of-the-art applications, including dashboards.

Data types and structures


R has many data types and data structures. The following topics summarize the main ones that you will use when building Shiny applications.

Dataframes, lists, arrays, and matrices

Dataframes have several important features, which make them useful for data analysis:

  • Rectangular data structures with the typical use being cases (for example, days in one month) down the rows and variables (page views, unique visitors, or referrers) along the columns.

  • A mix of data types is supported. A typical dataframe might include variables containing dates, numbers (integers or floats), and text.

  • With subsetting and variable extraction, R provides a lot of built-in functionality to select rows and variables within a dataframe.

  • Many functions include a data argument, which makes it simple to pass dataframes into functions and process only the relevant variables and cases, resulting in cleaner code.
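As a minimal sketch of these features, the following builds a small stand-in dataframe and passes it to a function through its data argument. The name exampleData, the uniqueVisitors and source columns, and all values are assumptions made for illustration; this is not the book's analyticsData.

```r
# Hypothetical stand-in data; names and values are for illustration only
exampleData <- data.frame(
  date = as.Date("2013-10-01") + 0:5,
  pageViews = c(836L, 676L, 940L, 689L, 647L, 899L),
  uniqueVisitors = c(410L, 350L, 470L, 340L, 320L, 450L),
  source = c("search", "direct", "search", "email", "direct", "search"),
  stringsAsFactors = FALSE
)

# str() shows the mix of types: Date, integer, and character columns
str(exampleData)

# Many functions take a data argument, so columns can be named directly
viewsBySource <- aggregate(pageViews ~ source, data = exampleData, FUN = mean)
viewsBySource
```

Because the formula refers to columns by name, only the variables that matter (pageViews and source) are mentioned in the call; the rest of the dataframe is ignored.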

We can inspect the first few rows of the dataframe using the head(analyticsData) command. The following screenshot shows the output of this command:

As you can see, there are four variables within the dataframe: one contains dates, two contain integers, and one contains numeric values. Variable types in R are described in more detail later in this section.

Variables can be extracted from dataframes very simply using the $ operator as follows:

> analyticsData$pageViews
 [1] 836 676 940 689 647 899 934 718 776 570 651 816
[13] 731 604 627 946 634 990 994 599 657 642 894 983
[25] 646 540 756 989 965 821

Also, variables can be extracted from dataframes using [], as shown in the following command:

> analyticsData[, "pageViews"]

Note the use of the comma with nothing before it to indicate that all rows are required. In general, dataframes can be accessed using dataObject[x,y] with x being the number(s) or name(s) of the rows required and y being the number(s) or name(s) of the columns required. For example, if the first 10 rows were required from the pageViews column, it could be achieved like this:

> analyticsData[1:10,"pageViews"]
[1] 836 676 940 689 647 899 934 718 776 570
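The same dataObject[x, y] pattern can be tried on a small stand-in dataframe. This is only a sketch: exampleData and its contents are assumptions made for illustration, not the book's data.

```r
# Hypothetical stand-in dataframe (not the book's analyticsData)
exampleData <- data.frame(
  day = 1:5,
  pageViews = c(836, 676, 940, 689, 647),
  visitors = c(410, 350, 470, 340, 320)
)

# Rows 1 to 3 of one column, selected by name: returns a vector
firstViews <- exampleData[1:3, "pageViews"]

# Rows 2 and 4 of two columns, selected by number: returns a dataframe
someRows <- exampleData[c(2, 4), 2:3]

# A logical condition in the row position keeps only the matching rows
busyDays <- exampleData[exampleData$pageViews > 800, ]

firstViews
someRows
busyDays
```

Selecting a single column returns a plain vector, whereas selecting several columns returns a dataframe, which is worth keeping in mind when passing the result to other functions.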

Leaving the position before the comma blank returns all rows, and leaving the position after the comma blank returns all variables. For example, the following command returns the first three rows of all variables:

> analyticsData[1:3,]

The following screenshot shows the output of this command:

Dataframes are a special type of list. Lists can hold many different types of data, including other lists. As with many data types in R, their elements can be named, which helps when writing code that is easy to understand. Let's make a list of the options for dinner, with drink quantities expressed in milliliters.

In the following example, note also the use of the c() function, which combines its comma-separated arguments into a vector (or into a list, if any argument is a list). R picks an appropriate class for the return value: character for vectors that contain strings, numeric for those that contain only numbers, logical for Boolean values, and so on:

> dinnerList <- list("Vegetables" =
  c("Potatoes", "Cabbage", "Carrots"),
  "Dessert" = c("Ice cream", "Apple pie"),
  "Drinks" = c(250, 330, 500)
)

Note

Note that code is indented throughout, although entering directly into the console will not produce indentations; it is done for readability.

Lists are indexed in much the same way as dataframes (which are, after all, just a special instance of a list). They can be indexed by number, as shown in the following command:

> dinnerList[1:2]
$Vegetables
[1] "Potatoes" "Cabbage"  "Carrots"

$Dessert
[1] "Ice cream" "Apple pie"

This returns a list. To extract a single element as an object of its own class, rather than as a list containing it, use [[]]:

> dinnerList[[3]]
[1] 250 330 500

In this case, a numeric vector is returned. Lists can also be indexed by name:

> dinnerList["Drinks"]
$Drinks
[1] 250 330 500

Note that this also returns a list, because single square brackets were used.
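The difference between single and double brackets can be checked directly with class(). The sketch below rebuilds dinnerList from the earlier example; menuFrame is a hypothetical dataframe added only to show that the same tools work on dataframes.

```r
dinnerList <- list(
  "Vegetables" = c("Potatoes", "Cabbage", "Carrots"),
  "Dessert" = c("Ice cream", "Apple pie"),
  "Drinks" = c(250, 330, 500)
)

# Single brackets keep the list wrapper
class(dinnerList["Drinks"])      # "list"

# Double brackets and $ return the element itself
class(dinnerList[["Drinks"]])    # "numeric"
dinnerList$Drinks[2]             # 330

# A dataframe is a list of equal-length columns, so the same tools apply
# (menuFrame is a hypothetical example)
menuFrame <- data.frame(item = c("Tea", "Juice"), ml = c(250, 330))
is.list(menuFrame)               # TRUE
menuFrame[["ml"]]                # 250 330
```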

Matrices and arrays, unlike dataframes, hold only one type of data. They also use square brackets for indexing: analyticsMatrix[, 3:6] returns all rows of the third to sixth columns, analyticsMatrix[1, 3] returns the single element in the first row of the third column, and analyticsArray[1, 2, ] returns the elements in the first row and second column across every slice of the third dimension.
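To make the indexing concrete, here is a small sketch. The object names match those in the text, but their contents (the integers 1 to 24) are assumptions chosen so that the results are easy to check by hand.

```r
# Hypothetical contents; R fills matrices and arrays column by column
analyticsMatrix <- matrix(1:24, nrow = 4, ncol = 6)
analyticsArray <- array(1:24, dim = c(2, 3, 4))

allRowsCols3to6 <- analyticsMatrix[, 3:6]   # a 4 x 4 matrix
singleValue <- analyticsMatrix[1, 3]        # one element: 9
acrossThirdDim <- analyticsArray[1, 2, ]    # a vector: 3 9 15 21

# Mixing types in a matrix coerces every element to a single type
mixedMatrix <- matrix(c(1, "a", 2, "b"), nrow = 2)
class(mixedMatrix[1, 1])                    # "character"
```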

Variable types

R is a dynamically typed language and you are not required to declare the type of your variables. It is worth knowing, of course, about the different types of variable that you might read or write using R. The different types of variable can be stored in a variety of structures, such as vectors, matrices, and dataframes, although some restrictions apply as detailed previously (for example, matrices must contain only one variable type):

  • Combining values that include at least one string produces a vector of strings (in R, the character data type):

    > c("First", "Third", 4, "Second")
    [1] "First"  "Third"  "4"  "Second"
    

    You will notice that the numeral 4 is converted to a string, "4". This is a result of coercion, in which elements of a data structure are converted to other data types in order to fit within the types allowed by that structure. Coercion occurs either automatically, as in this case, or through an explicit call to one of the as.*() functions, for example, as.numeric() or as.Date().

  • Declaring a variable with just numbers will produce a numeric vector:

    > c(15, 10, 20, 11, 0.4, -4)
    [1] 15.0 10.0 20.0 11.0  0.4 -4.0
    
  • R also includes a logical data type:

    > c(TRUE, FALSE, TRUE, TRUE, FALSE)
    [1]  TRUE FALSE  TRUE  TRUE FALSE
    
  • A data type exists for dates, often a source of problems for beginners:

    > as.Date(c("2013/10/24", "2012/12/05", "2011/09/02"))
    [1] "2013-10-24" "2012-12-05" "2011-09-02"
    
  • The use of the factor data type tells R all of the possible values of a categorical variable, such as gender or species:

    > factor(c("Male", "Female", "Female", "Male", "Male"),
      levels = c("Female", "Male"))
    [1] Male   Female Female Male   Male
    Levels: Female Male
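The types listed above can be inspected and converted with class() and the as.*() family. The following is a short sketch; the variable names and values are invented for illustration.

```r
# Automatic coercion: one string turns the whole vector into character
mixedVector <- c("First", 4, TRUE)
class(mixedVector)            # "character"

# Explicit coercion with as.*() functions
as.numeric("3.5")             # 3.5
as.integer(c(TRUE, FALSE))    # 1 0

# Dates in a non-default layout need a format string
visitDate <- as.Date("24/10/2013", format = "%d/%m/%Y")
format(visitDate)             # "2013-10-24"

# Factors store the allowed levels along with integer codes
gender <- factor(c("Male", "Female", "Male"), levels = c("Female", "Male"))
levels(gender)                # "Female" "Male"
as.integer(gender)            # 2 1 2
```

Supplying an explicit format to as.Date() avoids one common source of date problems, since the layout of the input no longer has to match one of the defaults.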
    

Functions

As you grow in confidence with R, you will wish to begin writing your own functions. This is straightforward and works in a manner reminiscent of many other languages. You will no doubt wish to read a fuller treatment of writing functions in R, but to give you an idea, here is a function called sumMultiply, which adds x and y together and multiplies the result by z:

sumMultiply <- function(x, y, z) {
  # Add x and y, then multiply the sum by z
  final <- (x + y) * z
  return(final)
}

This function can now be called using sumMultiply(2, 3, 6), which returns (2 + 3) * 6, that is, 30.
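As a variation on the same idea, a function argument can be given a default value, and because arithmetic in R is vectorized, the function also works on whole vectors. The function sumMultiplyDefault below is a hypothetical variant written for illustration, not code from the book.

```r
# Hypothetical variant: z defaults to 1 when it is not supplied
sumMultiplyDefault <- function(x, y, z = 1) {
  # The value of the last expression is returned automatically
  (x + y) * z
}

sumMultiplyDefault(2, 3, 6)                 # 30
sumMultiplyDefault(2, 3)                    # 5, using the default z = 1
sumMultiplyDefault(c(1, 2), c(3, 4), 2)     # 8 12, element by element
```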

Objects

There are many special object types within R that are designed to make it easier to analyze data. Functions in R can be polymorphic; that is to say, they can respond to different data types in different ways in order to produce the output that the user desires. For example, the plot() function in R responds to a wide variety of data types and objects, including one-dimensional vectors (each value of y plotted sequentially) and two-dimensional matrices (producing a scatterplot), as well as specialized statistical objects such as regression models and time series data. For these specialized objects, plot() produces plots suited to the purpose.
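One mechanism behind this behavior in base R is S3 method dispatch, in which a generic function chooses a method based on the class of its first argument. The sketch below defines a hypothetical generic called describe() purely for illustration; it is not part of R or Shiny.

```r
# describe() is a hypothetical generic, defined here only for illustration
describe <- function(x, ...) UseMethod("describe")

# Method used for numeric vectors
describe.numeric <- function(x, ...) {
  paste("numeric vector with mean", mean(x))
}

# Method used for factors
describe.factor <- function(x, ...) {
  paste("factor with levels:", paste(levels(x), collapse = ", "))
}

# Fallback used when no more specific method exists
describe.default <- function(x, ...) {
  paste("object of class", class(x)[1])
}

describe(c(2, 4, 6))                  # "numeric vector with mean 4"
describe(factor(c("a", "b", "a")))    # "factor with levels: a, b"
describe("text")                      # "object of class character"
```

The caller writes describe(x) in every case, and R selects the appropriate method from the class of x.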

As with the rest of this introduction, don't worry if you haven't written functions before, or if you don't understand object concepts and aren't sure what this all means. You can produce great applications without understanding all these things, but as you do more and more with R, you will start to want to learn in more detail how R works and how experts write R code. This introduction is designed to give you a jumping-off point to learn more about how to get the best out of R (and Shiny).