Book Image

Building Statistical Models in Python

By : Huy Hoang Nguyen, Paul N Adams, Stuart J Miller
Book Image

Building Statistical Models in Python

By: Huy Hoang Nguyen, Paul N Adams, Stuart J Miller

Overview of this book

The ability to proficiently perform statistical modeling is a fundamental skill for data scientists and essential for businesses reliant on data insights. Building Statistical Models with Python is a comprehensive guide that will empower you to leverage mathematical and statistical principles in data assessment, understanding, and inference generation. This book not only equips you with skills to navigate the complexities of statistical modeling, but also provides practical guidance for immediate implementation through illustrative examples. Through emphasis on application and code examples, you’ll understand the concepts while gaining hands-on experience. With the help of Python and its essential libraries, you’ll explore key statistical models, including hypothesis testing, regression, time series analysis, classification, and more. By the end of this book, you’ll gain fluency in statistical modeling while harnessing the full potential of Python's rich ecosystem for data analysis.
Table of Contents (22 chapters)
1
Part 1:Introduction to Statistics
7
Part 2:Regression Models
10
Part 3:Classification Models
13
Part 4:Time Series Models
17
Part 5:Survival Analysis

Chi-square distribution

Researchers are often faced with the need to test hypotheses on categorical data. The parametric tests covered in Chapter 4, Parametric Tests, are often not very helpful for this type of analysis. In the last chapter, we discussed using an F-test to compare sample variances. Extending that concept, we can consider the non-parametric and non-symmetric chi-square probability distribution, which is a distribution useful for comparing the means of sampling distribution variances to their population variances, specifically when the mean of a sampling distribution of sample variances is expected to equal the population variance under the null hypothesis. Because variance cannot be negative, the distribution starts at an origin of 0. Here, we can see the chi-square distribution:

Figure 5.5 – Chi-square distribution with seven degrees of freedom

Figure 5.5 – Chi-square distribution with seven degrees of freedom

The shape of the chi-square distribution does not represent an assumption that percentiles...