The Supervised Learning Workshop - Second Edition

By: Blaine Bateman, Ashish Ranjan Jha, Benjamin Johnston, Ishita Mathur

Overview of this book

Would you like to understand how and why machine learning techniques and data analytics are being adopted by enterprises globally? From analyzing bioinformatics to predicting climate change, machine learning plays an increasingly pivotal role in our society. Although the real-world applications may seem complex, this book simplifies supervised learning for beginners with a step-by-step interactive approach. Working with real-time datasets, you’ll learn how supervised learning, when used with Python, can produce efficient predictive models. Starting with the fundamentals of supervised learning, you’ll quickly move on to automating manual tasks and assessing data using Jupyter and Python libraries like pandas. Next, you’ll use data exploration and visualization techniques to develop powerful supervised learning models, before learning how to distinguish variables and represent their relationships using scatter plots, heatmaps, and box plots. After using regression and classification models on real-time datasets to predict future outcomes, you’ll grasp advanced ensemble techniques such as boosting and random forests. Finally, you’ll learn the importance of model evaluation in supervised learning and study metrics for evaluating regression and classification tasks. By the end of this book, you’ll have the skills you need to work on your own real-life supervised learning projects in Python.
Table of Contents (9 chapters)

4. Autoregression

Activity 4.01: Autoregression Model Based on Periodic Data
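
Before working through the steps, it may help to see the idea of autoregression in isolation. The following is a minimal sketch, not the book's solution. It assumes a synthetic, noise-free sine wave with a 365-step period as a stand-in for daily temperatures, and it fits the lag coefficients by ordinary least squares with NumPy rather than with statsmodels. The helper names fit_ar and forecast are hypothetical and exist only for this illustration.

```python
import numpy as np


def fit_ar(series, lags):
    """Least-squares fit of x[t] = sum_k coef[k] * x[t-1-k] (no intercept).

    Hypothetical helper for illustration; not part of statsmodels.
    """
    x = np.asarray(series, dtype=float)
    # Column k holds the series shifted back by k + 1 steps.
    design = np.column_stack(
        [x[lags - 1 - k: len(x) - 1 - k] for k in range(lags)]
    )
    target = x[lags:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def forecast(history, coef, steps):
    """Roll the fitted model forward, feeding predictions back in as inputs."""
    values = list(history)
    for _ in range(steps):
        # Most recent value first, to line up with coef[0] (lag 1).
        recent = values[-1: -len(coef) - 1: -1]
        values.append(float(np.dot(coef, recent)))
    return np.array(values[len(history):])


# Assumed synthetic data: three "years" of a zero-mean periodic signal.
period = 365
omega = 2 * np.pi / period
t = np.arange(3 * period)
series = 15 * np.sin(omega * t)

# Fit on the first two periods, then predict the next 30 steps.
train = series[: 2 * period]
coef = fit_ar(train, lags=2)
preds = forecast(train, coef, steps=30)

# A pure sinusoid satisfies x[t] = 2*cos(omega)*x[t-1] - x[t-2] exactly,
# so the fitted coefficients should come out close to [2*cos(omega), -1].
print(coef)
print(np.max(np.abs(preds - series[2 * period: 2 * period + 30])))
```

Two lags are enough here only because the signal is a single clean sinusoid. Real temperature data is noisy and not zero-mean, so it would typically need more lags and a constant term.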

  1. Import the necessary packages, classes, and libraries.

    Note

    This activity works with an earlier version of pandas, so make sure that you downgrade pandas using the following command:

    pip install pandas==0.24.2

    The code is as follows:

    import pandas as pd
    import numpy as np
    from statsmodels.tsa.ar_model import AR
    from statsmodels.graphics.tsaplots import plot_acf
    import matplotlib.pyplot as plt
  2. Load the data and convert the Date column to datetime:
    df = pd.read_csv('../Datasets/austin_weather.csv')
    df.Date = pd.to_datetime(df.Date)
    print(df.head())
    print(df.tail())

    The output for df.head() should look as follows:

    Figure 4.22: Output for df.head()

    The output for df.tail() should look as follows:

    Figure 4.23: Output for df.tail()

  3. Plot the complete set of average temperature values (df.TempAvgF) with Date on the x axis:
    fig, ax = plt.subplots(figsize = (10, 7))
    ax.scatter(df.Date, df.TempAvgF)
    plt...