Python Feature Engineering Cookbook

Python Feature Engineering Cookbook - Third Edition

By: Galli

Overview of this book

Streamline data preprocessing and feature engineering in your machine learning project with this third edition of the Python Feature Engineering Cookbook to make your data preparation more efficient. This guide addresses common challenges, such as imputing missing values and encoding categorical variables using practical solutions and open source Python libraries. You’ll learn advanced techniques for transforming numerical variables, discretizing variables, and dealing with outliers. Each chapter offers step-by-step instructions and real-world examples, helping you understand when and how to apply various transformations for well-prepared data. The book explores feature extraction from complex data types such as dates, times, and text. You’ll see how to create new features through mathematical operations and decision trees and use advanced tools like Featuretools and tsfresh to extract features from relational data and time series. By the end, you’ll be ready to build reproducible feature engineering pipelines that can be easily deployed into production, optimizing data preprocessing workflows and enhancing machine learning model performance.
Table of Contents (14 chapters)

Performing equal-width discretization

Equal-width discretization consists of dividing the range of observed values for a variable into k equally sized intervals, where k is supplied by the user. The interval width for the X variable is given by the following:

$$\mathrm{Width} = \frac{\max(X) - \min(X)}{k}$$

For example, if the values of the variable vary between 0 and 100, we can create five bins like this: width = (100-0) / 5 = 20. The bins will be 0–20, 20–40, 40–60, 60–80, and 80–100. The first and final bins (0–20 and 80–100) can be expanded to accommodate values smaller than 0 or greater than 100 by extending the limits to minus and plus infinity.
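The arithmetic in this example can be checked with a short sketch. The sample values and the variable names (`x`, `k`, `edges`, `open_edges`) are illustrative assumptions, not taken from the book's dataset; the snippet only shows one way to compute the width, build the bin edges, and open the outer bins to infinity with pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values in the 0-100 range (not the book's dataset).
x = pd.Series([0, 5, 20, 37, 59, 60, 81, 100], name="x")

k = 5  # number of bins, chosen by the user
width = (x.max() - x.min()) / k  # (100 - 0) / 5

# k + 1 equally spaced edges between the minimum and the maximum.
edges = np.linspace(x.min(), x.max(), k + 1)

# Open the outer bins so values below 0 or above 100 still fall in a bin.
open_edges = edges.copy()
open_edges[0], open_edges[-1] = -np.inf, np.inf

# labels=False returns the integer bin index (0 to k - 1) for each value.
# pd.cut closes intervals on the right by default, so 20 lands in the first bin.
bins = pd.cut(x, bins=open_edges, labels=False)

# Values outside the observed range are captured by the open outer bins.
unseen = pd.cut(pd.Series([-10, 150]), bins=open_edges, labels=False)
```

Here a value that sits exactly on an edge is assigned to the lower bin; passing `right=False` to `pd.cut` would assign it to the upper bin instead.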

In this recipe, we will carry out equal-width discretization using pandas, scikit-learn, and feature-engine.
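As a rough preview of the scikit-learn route, the sketch below uses `KBinsDiscretizer` with `strategy="uniform"`, which produces equal-width bins. The toy array is an assumption made for illustration, and this is not necessarily the exact code the recipe uses.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical single-feature data in the 0-100 range (not the book's dataset).
X = np.array([[0.0], [5.0], [20.0], [37.0], [59.0], [60.0], [81.0], [100.0]])

# strategy="uniform" gives equal-width bins; encode="ordinal" returns
# the bin index for each value instead of a one-hot matrix.
disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")
X_binned = disc.fit_transform(X)

# Edges learned from the data for the first (and only) feature.
edges = disc.bin_edges_[0]
```

Note that scikit-learn assigns a value lying exactly on an edge to the upper bin, whereas `pandas.cut` assigns it to the lower bin by default, so the two tools can disagree at the boundaries.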

How to do it...

First, let’s import the necessary Python libraries and get the dataset ready:

  1. Let’s import the libraries and functions:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from...