Building Statistical Models in Python

By: Huy Hoang Nguyen, Paul N Adams, Stuart J Miller

Overview of this book

The ability to proficiently perform statistical modeling is a fundamental skill for data scientists and essential for businesses reliant on data insights. Building Statistical Models in Python is a comprehensive guide that will empower you to leverage mathematical and statistical principles in data assessment, understanding, and inference generation. This book not only equips you with skills to navigate the complexities of statistical modeling, but also provides practical guidance for immediate implementation through illustrative examples. Through emphasis on application and code examples, you’ll understand the concepts while gaining hands-on experience. With the help of Python and its essential libraries, you’ll explore key statistical topics, including hypothesis testing, regression, time series analysis, classification, and more. By the end of this book, you’ll gain fluency in statistical modeling while harnessing the full potential of Python's rich ecosystem for data analysis.
Table of Contents (22 chapters)
Part 1: Introduction to Statistics
Part 2: Regression Models
Part 3: Classification Models
Part 4: Time Series Models
Part 5: Survival Analysis

When parametric test assumptions are violated

In the previous chapter, we discussed parametric tests. Parametric tests have strong statistical power but require strong assumptions about the data. When those assumptions are not satisfied, the test results are not valid. Fortunately, alternative tests are available for this situation. These are called non-parametric tests, meaning that they do not assume a specific underlying distribution for the data. Non-parametric tests do, however, still require the samples to be independent.

Permutation tests

For the first non-parametric test, let’s look more deeply at the definition of a p-value. A p-value is the probability of obtaining a test statistic at least as extreme as the observed value under the assumption of the null hypothesis. Then, to calculate a p-value, we need the null distribution and an observed statistic...
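The two ingredients named above, a null distribution and an observed statistic, can be sketched in a few lines of NumPy. This is a minimal illustration rather than the book's own code: the helper name `permutation_test_mean_diff`, the choice of the difference in means as the test statistic, the number of resamples, and the sample values are all assumptions made for the example. The null distribution is approximated by repeatedly shuffling the pooled observations, which is reasonable when the group labels are exchangeable under the null hypothesis.

```python
import numpy as np


def permutation_test_mean_diff(x, y, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Hypothetical helper written for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Observed statistic: difference between the two sample means.
    observed = x.mean() - y.mean()

    # Under the null hypothesis the group labels are exchangeable,
    # so pool the data and reassign labels at random.
    pooled = np.concatenate([x, y])
    n_x = x.size

    null_stats = np.empty(n_resamples)
    for i in range(n_resamples):
        shuffled = rng.permutation(pooled)
        null_stats[i] = shuffled[:n_x].mean() - shuffled[n_x:].mean()

    # p-value: share of null statistics at least as extreme as the
    # observed one. Adding 1 to numerator and denominator counts the
    # observed labelling itself, so the estimate is never exactly zero.
    n_extreme = np.sum(np.abs(null_stats) >= abs(observed))
    p_value = (n_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up measurements for two independent groups (illustrative only).
group_a = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
group_b = [10.2, 11.8, 12.5, 11.1, 12.9, 10.7]

observed, p_value = permutation_test_mean_diff(group_a, group_b)
print(f"observed difference in means: {observed:.3f}")
print(f"permutation p-value: {p_value:.4f}")
```

Random resampling is used here instead of enumerating every relabelling because the number of possible splits grows quickly with sample size. Recent SciPy releases also provide `scipy.stats.permutation_test`, which may be preferable in practice.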