The Statistics and Calculus with Python Workshop

By: Peter Farrell, Alvaro Fuentes, Ajinkya Sudhir Kolhe, Quan Nguyen, Alexander Joseph Sarver, Marios Tsatsos

Overview of this book

Are you looking to start developing artificial intelligence applications? Do you need a refresher on key mathematical concepts? Full of engaging practical exercises, The Statistics and Calculus with Python Workshop will show you how to apply your understanding of advanced mathematics in the context of Python. The book begins with a high-level overview of the libraries you'll use for statistics in Python. As you progress, you'll perform various mathematical tasks in Python, such as working with algebraic functions, starting with basic functions and then moving on to transformations and solving equations. Later chapters cover statistics and calculus concepts and show how to use them to solve problems and gain useful insights. Finally, you'll study differential equations with an emphasis on numerical methods and learn about algorithms that directly calculate values of functions. By the end of this book, you’ll have learned how to apply essential statistics and calculus concepts to develop robust Python applications that solve business challenges.

3. Python's Statistical Toolbox

Activity 3.01: Revisiting the Communities and Crimes Dataset

Solution

  1. The libraries can be imported, and pandas can be used to read in the dataset as follows:
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    df = pd.read_csv('CommViolPredUnnormalizedData.txt')
    df.head()

    Your output should be the following:

    Figure 3.29: The first five rows of the dataset

  2. To replace the special character with the np.nan object, we can use the following code:
    df = df.replace('?', np.nan)
  3. To compute the actual count for the different age groups, we can simply use the expression df['population'] * df['agePct...'], which computes the count in a vectorized way:
    age_groups = ['12t21', '12t29', '16t24', '65up']
     
    for group in age_groups:
        df['ageCnt' + group] = (df['population'] * \
      ...
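
As a small, self-contained illustration of steps 2 and 3, the sketch below applies the same two ideas to a toy DataFrame rather than the real dataset. The column names `pct_young` and `cnt_young` are hypothetical, and dividing by 100 assumes the column stores percentages; neither is taken from the book's solution. The explicit `pd.to_numeric` call is also an addition here: a column that contained `'?'` holds strings, so it may need converting before arithmetic.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the real dataset; column names are hypothetical.
toy = pd.DataFrame({
    "population": [1000, 2000, 500],
    "pct_young": ["10.0", "?", "20.0"],  # '?' marks a missing value
})

# Replace the placeholder with NaN, then convert the string column to floats.
toy = toy.replace("?", np.nan)
toy["pct_young"] = pd.to_numeric(toy["pct_young"])

# Vectorized multiplication over whole columns, with no explicit row loop.
# Dividing by 100 assumes the column holds percentages (an assumption).
toy["cnt_young"] = toy["population"] * toy["pct_young"] / 100

print(toy)
```

Rows with a missing percentage end up with a missing count as well, because NaN propagates through the arithmetic.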