Building Statistical Models in Python

By: Huy Hoang Nguyen, Paul N Adams, Stuart J Miller

Overview of this book

Proficiency in statistical modeling is a fundamental skill for data scientists and essential for businesses that rely on data insights. Building Statistical Models in Python is a comprehensive guide that will empower you to apply mathematical and statistical principles to assess data, understand it, and draw inferences from it. This book not only equips you with the skills to navigate the complexities of statistical modeling, but also provides practical guidance for immediate implementation through illustrative examples. Through its emphasis on application and code examples, you’ll understand the concepts while gaining hands-on experience. With the help of Python and its essential libraries, you’ll explore key statistical methods and models, including hypothesis testing, regression, time series analysis, classification, and more. By the end of this book, you’ll have gained fluency in statistical modeling while harnessing the full potential of Python’s rich ecosystem for data analysis.
Table of Contents (22 chapters)
Part 1: Introduction to Statistics
Part 2: Regression Models
Part 3: Classification Models
Part 4: Time Series Models
Part 5: Survival Analysis

Multivariate Time Series

The models we discussed in the previous chapter depended only on past values of a single variable of interest. Those models are appropriate when the time series contains only one variable. However, time-series data commonly contains multiple variables, and the other variables can often improve forecasts of the variable of interest. In this chapter, we discuss models for time series with multiple variables. We first cover the correlation between time-series variables, then how to model multivariate time series. While there are many models for multivariate time-series data, we focus on two that are both powerful and widely used: autoregressive integrated moving average with exogenous variables (ARIMAX) and vector autoregression (VAR). Understanding these two models will extend the reader’s model toolbox and provide building blocks for the reader to learn more about multivariate...
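To make the idea of correlation between time-series variables concrete, here is a minimal sketch, not taken from the book, that measures how strongly past values of one series track the current value of another. The helper name `lagged_cross_correlation`, the synthetic data, and the two-step delay are all assumptions chosen for illustration; only NumPy is used.

```python
import numpy as np


def lagged_cross_correlation(x, y, max_lag):
    """Pearson correlation between x[t - k] and y[t] for k = 0..max_lag.

    Hypothetical helper for illustration; not an API from any library.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    n = len(x)
    result = {}
    for k in range(max_lag + 1):
        # Pair x from k steps in the past with the current value of y.
        x_past = x[: n - k]
        y_now = y[k:]
        result[k] = float(np.corrcoef(x_past, y_now)[0, 1])
    return result


# Synthetic data (an assumption): y follows x with a 2-step delay plus noise.
rng = np.random.default_rng(0)
n_obs = 500
x = rng.normal(size=n_obs)
y = np.empty(n_obs)
y[:2] = rng.normal(size=2)
y[2:] = 0.8 * x[:-2] + 0.2 * rng.normal(size=n_obs - 2)

ccf = lagged_cross_correlation(x, y, max_lag=5)
best_lag = max(ccf, key=lambda k: abs(ccf[k]))
print("lag with strongest correlation:", best_lag)
```

On this synthetic data the strongest correlation should appear at lag 2, which is the kind of signal suggesting that a second variable carries information useful for forecasting the first. Models such as ARIMAX and VAR exploit relationships of this kind in different ways.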