Hands-On Time Series Analysis with R

By: Rami Krispin

Overview of this book

Time-series analysis is the art of extracting meaningful insights from, and revealing patterns in, time-series data using statistical and data visualization approaches. These insights and patterns can then be utilized to explore past events and forecast future values in the series. This book explores the basics of time-series analysis with R and lays the foundation you need to build forecasting models. You will learn how to preprocess raw time-series data and clean and manipulate data with packages such as stats, lubridate, xts, and zoo. You will analyze data using both descriptive statistics and rich data visualization tools in R including the TSstudio, plotly, and ggplot2 packages. The book then delves into traditional forecasting models such as time-series linear regression, exponential smoothing (Holt, Holt-Winters, and more) and Auto-Regressive Integrated Moving Average (ARIMA) models with the stats and forecast packages. You'll also work on advanced time-series regression models with machine learning algorithms such as random forest and Gradient Boosting Machine using the h2o package. By the end of this book, you will have developed the skills necessary for exploring your data, identifying patterns, and building a forecasting model using various traditional and machine learning methods.
Table of Contents (14 chapters)

Getting started with R

R is a free, open source programming language for statistical computing and graphics. With more than 13,500 indexed packages (as of May 2019, as you can see in the following graph) and a large number of applications for statistics, machine learning, data mining, and data visualization, R is one of the most popular statistical programming languages. One of the main reasons for the fast growth of R in recent years is its open source structure, where users are also the main package developers. Among the package developers, you can find individuals like us, as well as giant companies such as Microsoft, Google, and Facebook. This significantly reduces users' dependency on any specific company (as opposed to traditional statistical software), allowing for fast knowledge sharing and a diverse portfolio of solutions.

The following graph shows the number of packages that have been shared on CRAN over time:

As a result, whenever you come across a statistical problem, it is likely that someone has already faced the same problem and developed a package with a solution (and if not, you should create one!). Furthermore, there is a vast number of packages for time series analysis, from tools for data preparation and visualization to advanced statistical modeling applications. Packages such as forecast, stats, zoo, xts, and lubridate have made R one of the leading tools for time series analysis. In the A brief introduction to R section of this chapter, we will discuss the key packages we will use throughout this book in more detail.

Now, we will learn how to install R.

Installing R

To install R on Windows, Mac, or Linux, go to the Comprehensive R Archive Network (CRAN) main page at https://cran.r-project.org/, where you can select the relevant operating system.

For Windows users, the installation file includes both the 32-bit and the 64-bit versions. You can install either one of the versions or both of them together. Technically, once the installation is complete, you can start working with R using the built-in Integrated Development Environment (IDE).

However, it is highly recommended to install the RStudio IDE and set it as your working environment for R. RStudio makes writing and debugging code, as well as using visualization tools and other applications, easier and simpler.

RStudio offers a free version of its IDE, which is available at https://www.rstudio.com/products/rstudio/download/.
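
Once R is installed, you may want to confirm that the setup works before moving on. The following is a minimal sketch rather than a required step. It prints the version and the 32-bit or 64-bit build of the running R session, and then checks which of the time series packages mentioned earlier in this chapter are still missing. The missing_packages helper and the ts_packages vector are hypothetical names introduced for this illustration, and the install step assumes an internet connection and a reachable CRAN mirror:

```r
# Confirm which R build is running (version string and 32- vs 64-bit).
cat(R.version.string, "\n")
cat("Pointer size:", .Machine$sizeof.pointer * 8, "bit\n")

# Packages named in this chapter; stats ships with base R, so it is not listed.
ts_packages <- c("forecast", "zoo", "xts", "lubridate")

# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(ts_packages)

# Install only in an interactive session, so that sourcing this script
# has no side effects (assumes a CRAN mirror is reachable).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

You can paste these lines into the R console or the RStudio console. If to_install is empty, all four packages are already available and nothing is downloaded.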