Hands-On Time Series Analysis with R

By: Rami Krispin

Overview of this book

Time-series analysis is the art of extracting meaningful insights from, and revealing patterns in, time-series data using statistical and data visualization approaches. These insights and patterns can then be used to explore past events and forecast future values in the series. This book explores the basics of time-series analysis with R and lays the foundation you need to build forecasting models. You will learn how to preprocess raw time-series data and clean and manipulate data with packages such as stats, lubridate, xts, and zoo. You will analyze data using both descriptive statistics and rich data visualization tools in R, including the TSstudio, plotly, and ggplot2 packages. The book then delves into traditional forecasting models such as time-series linear regression, exponential smoothing (Holt, Holt-Winters, and more), and Autoregressive Integrated Moving Average (ARIMA) models with the stats and forecast packages. You'll also work on advanced time-series regression models with machine learning algorithms such as random forest and Gradient Boosting Machine using the h2o package. By the end of this book, you will have developed the skills necessary for exploring your data, identifying patterns, and building a forecasting model using various traditional and machine learning methods.
Table of Contents (14 chapters)

Correlation between two variables

One of the main goals of correlation analysis is to identify and quantify the relationship between two variables. This relationship can range from full dependency (a perfect linear relationship between the two) to complete independence. One of the most popular methods for measuring the level of correlation between two variables is the Pearson correlation coefficient. Although this method is not necessarily the most appropriate one for time series data, it is a simple and intuitive representation of the statistical logic behind most of the methods for measuring correlation. This method, also known as the population correlation coefficient, is the ratio between the covariance of two variables and the product of their standard deviations:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
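
As a minimal sketch of this ratio in R, the following example computes the coefficient by hand and compares it with the built-in cor function from the stats package. The two series x and y are simulated, hypothetical data used purely for illustration, not a dataset from the book:

```r
# Hypothetical illustration data: two simulated, linearly related variables
set.seed(1234)
x <- rnorm(100)
y <- 0.8 * x + rnorm(100, sd = 0.5)

# Ratio of the covariance to the product of the standard deviations.
# cov() and sd() both use the n - 1 denominator, so it cancels out in the ratio
pearson_manual <- cov(x, y) / (sd(x) * sd(y))

# Built-in Pearson correlation from the stats package
pearson_builtin <- cor(x, y, method = "pearson")

print(c(manual = pearson_manual, builtin = pearson_builtin))
```

Both values should agree up to floating-point precision, and the result always falls between -1 and 1.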

The values of the correlation coefficient segment the level of correlation into three main groups:

  • Positively...