#### Overview of this book

Explore the world of Business Intelligence through the eyes of an analyst working in a successful and growing company. Learn R through use cases supporting different functions within that company. This book provides data-driven and analytically focused approaches to help you answer questions in operations, marketing, and finance. In Part 1, you will learn about extracting data from different sources, cleaning that data, and exploring its structure. In Part 2, you will explore predictive models and cluster analysis for Business Intelligence and analyze financial time series. Finally, in Part 3, you will learn to communicate results with sharp visualizations and interactive, web-based dashboards. After completing the use cases, you will be able to work with business data in the R programming environment and understand how data science helps you make informed decisions and develop business strategy. Along the way, you will find helpful tips about R and Business Intelligence.

## Building ARIMA time series models

ARIMA is an acronym whose letters name the parts of a modeling approach for time series data. ARIMA models contain the following three elements:

• AR: Autoregressive, specified with p or P

• I: Integrated (differencing), specified with d or D

• MA: Moving average, specified with q or Q
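
As a rough illustration of how these three elements map onto code, the sketch below fits a model with the `arima()` function from base R's `stats` package. It assumes that the lowercase letters (p, d, q) denote the non-seasonal terms and the uppercase letters (P, D, Q) the seasonal terms, as in the common ARIMA(p, d, q)(P, D, Q) notation. The built-in `AirPassengers` dataset and the specific orders are stand-ins chosen only for demonstration; they are not taken from the book's use case.

```r
# Illustrative only: AirPassengers ships with R's datasets package
# and is an assumption here, not the book's data.
passengers <- log(AirPassengers)

# Non-seasonal terms go in order = c(p, d, q).
# Seasonal terms go in seasonal$order = c(P, D, Q), with a 12-month period.
fit <- arima(passengers,
             order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Print the estimated coefficients and fit statistics.
print(fit)
```

Changing any one of the six numbers changes the corresponding element of the model, so this call is a convenient place to experiment with different specifications.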

Autoregressive means that earlier, lagged points in the data influence later points in the sequence, which creates a dependence between observations. The order of an AR model is the number of steps back (lags) over which past points affect the current point. Data in which past values have a longer-lasting effect call for more lags, and the more lags the model includes, the higher its AR order. You will see models referred to as AR(1), AR(2), and so forth, where the number in parentheses is p, the number of lags in the autoregressive model.
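
To make the idea of AR order concrete, here is a minimal sketch that simulates an AR(1) and an AR(2) series with `arima.sim()` and then fits each with `arima()`, both from base R's `stats` package. The coefficients (0.7, and 0.5 with 0.3), the series length, and the seed are arbitrary assumptions chosen for illustration, not values from the book.

```r
# Arbitrary seed so the simulated series are reproducible.
set.seed(42)

# AR(1): each point depends on the one point before it (p = 1).
series_ar1 <- arima.sim(model = list(ar = 0.7), n = 500)

# AR(2): each point depends on the two points before it (p = 2).
series_ar2 <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 500)

# Fit pure autoregressive models: order = c(p, 0, 0).
fit_ar1 <- arima(series_ar1, order = c(1, 0, 0))
fit_ar2 <- arima(series_ar2, order = c(2, 0, 0))

# The fitted AR coefficients should land near the simulated values.
print(coef(fit_ar1))
print(coef(fit_ar2))
```

The AR(2) fit reports one more autoregressive coefficient than the AR(1) fit, which is what a higher p means in practice.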

Integrated refers to the differencing that you learned about earlier. The d value represents the number of times the data is differenced in the model. It is typically...