Building Statistical Models in Python

By: Huy Hoang Nguyen, Paul N Adams, Stuart J Miller

Overview of this book

The ability to proficiently perform statistical modeling is a fundamental skill for data scientists and essential for businesses reliant on data insights. Building Statistical Models in Python is a comprehensive guide that will empower you to leverage mathematical and statistical principles in data assessment, understanding, and inference generation. This book not only equips you with skills to navigate the complexities of statistical modeling, but also provides practical guidance for immediate implementation through illustrative examples. Through emphasis on application and code examples, you’ll understand the concepts while gaining hands-on experience. With the help of Python and its essential libraries, you’ll explore key statistical models, including hypothesis testing, regression, time series analysis, classification, and more. By the end of this book, you’ll gain fluency in statistical modeling while harnessing the full potential of Python’s rich ecosystem for data analysis.

The Rank-Sum test

When the assumptions of the t-test are not met, the Rank-Sum test is often a good non-parametric alternative. While the t-test tests for a difference between the means of two distributions, the Rank-Sum test tests for a difference between their locations. This difference in utility arises because the Rank-Sum test makes no parametric assumptions about the distributions. The null hypothesis of the Rank-Sum test is that the distribution underlying the first sample is the same as the distribution underlying the second sample. If the two sample distributions appear similar in shape, a rejection of the null hypothesis can be interpreted as a difference in the locations of the two samples. As stated, the Rank-Sum test cannot be used specifically to test for a difference between means, because it makes no assumptions about the form of the sample distributions.
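To make the idea concrete, here is a minimal sketch of running a Rank-Sum test with `scipy.stats.ranksums`. The data are hypothetical and generated only for illustration: two right-skewed samples with the same shape, where the second is shifted to the right by an assumed amount of 2.0. The sample sizes, seed, and significance level are likewise assumptions, not values from the text.

```python
import numpy as np
from scipy import stats

# Hypothetical data (not from the book): two right-skewed samples that share
# the same shape, with the second shifted to the right by 2.0.
rng = np.random.default_rng(seed=0)
sample_a = rng.exponential(scale=1.0, size=40)
sample_b = rng.exponential(scale=1.0, size=40) + 2.0

# Two-sided Wilcoxon rank-sum test. SciPy computes the p-value from a
# normal approximation of the rank-sum statistic.
result = stats.ranksums(sample_a, sample_b)
print(f"statistic = {result.statistic:.3f}, p-value = {result.pvalue:.4g}")

# Assumed significance level for this illustration.
alpha = 0.05
reject_null = result.pvalue < alpha
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

Because the two samples here have the same shape by construction, a small p-value can reasonably be read as evidence of a shift in location. Note that `ranksums` does not handle ties; for data with ties, `scipy.stats.mannwhitneyu` may be the better choice.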

The test statistic procedure

The test procedure is straightforward. The process is outlined here and an example...