Building Statistical Models in Python

By: Huy Hoang Nguyen, Paul N Adams, Stuart J Miller

Overview of this book

The ability to perform statistical modeling proficiently is a fundamental skill for data scientists and essential for businesses reliant on data insights. Building Statistical Models in Python is a comprehensive guide that will empower you to leverage mathematical and statistical principles in data assessment, understanding, and inference generation. This book not only equips you with skills to navigate the complexities of statistical modeling, but also provides practical guidance for immediate implementation through illustrative examples. Through emphasis on application and code examples, you’ll understand the concepts while gaining hands-on experience. With the help of Python and its essential libraries, you’ll explore key statistical models, including hypothesis testing, regression, time series analysis, classification, and more. By the end of this book, you’ll gain fluency in statistical modeling while harnessing the full potential of Python's rich ecosystem for data analysis.

Table of Contents (22 chapters)

  • Part 1: Introduction to Statistics
  • Part 2: Regression Models
  • Part 3: Classification Models
  • Part 4: Time Series Models
  • Part 5: Survival Analysis

Measuring and describing distributions

The distributions of data found in the wild come in many shapes and sizes. This section will discuss how distributions are measured and which measurements apply to the four types of data. These measurements will provide methods to compare and contrast different distributions. The measurements discussed in this section can be broken into the following categories:

  • Central tendency
  • Variability
  • Shape

These measurements are called descriptive statistics. The descriptive statistics discussed in this section are commonly used in statistical summaries of data.
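
As a rough illustration of the three categories, the sketch below computes one measurement from each using only Python's standard library. The `describe` helper, its dictionary keys, and the `sample` values are hypothetical names and data chosen for this example, not taken from the book. The mean stands in for central tendency, the population standard deviation for variability, and skewness (computed here as the average cubed z-score) for shape.

```python
import statistics


def describe(values):
    """Return one illustrative measurement per category (hypothetical helper)."""
    mean = statistics.fmean(values)    # central tendency
    std = statistics.pstdev(values)    # variability (population standard deviation)
    # Shape: population skewness, i.e. the average cubed z-score
    skewness = sum(((x - mean) / std) ** 3 for x in values) / len(values)
    return {"mean": mean, "std": std, "skewness": skewness}


# Made-up sample used only for illustration
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
print(summary)
```

A positive skewness here suggests a longer tail toward larger values. Other measurements from each category could be substituted in the same pattern.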

Measuring central tendency

There are three measurements of central tendency:

  • Mode
  • Median
  • Mean
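
Before looking at each measurement in detail, here is a minimal sketch of all three using Python's built-in `statistics` module. The `observations` list is made-up data that includes one extreme value, so the three measurements come out different.

```python
import statistics

# Hypothetical sample with one extreme value (50)
observations = [3, 5, 5, 6, 7, 8, 50]

mode_value = statistics.mode(observations)      # most frequently occurring value
median_value = statistics.median(observations)  # middle value once the data is sorted
mean_value = statistics.mean(observations)      # sum of the values divided by their count

print(mode_value, median_value, mean_value)  # prints: 5 6 12
```

In this sample the single extreme value pulls the mean well above the median and the mode, which is one reason it can help to look at more than one measurement of central tendency.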

Let’s discuss each one of them.

Mode

The first measurement of central tendency we will discuss is the mode. The mode of a dataset is simply the most commonly occurring instance. Using the machines in the factory as an example (see...