Feature Engineering Made Easy

By: Sinan Ozdemir, Divya Susarla

Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start with understanding your data; often the success of your ML models depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features. You will learn to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will become proficient in Feature Selection, Feature Learning, and Feature Optimization.

Chapter 4. Feature Construction

In the previous chapter, we worked with the Pima Indian Diabetes Prediction dataset to get a better understanding of which given features in our dataset are most valuable. Working with the features that were available to us, we identified missing values within our columns and employed techniques of dropping missing values, imputing, and normalizing/standardizing our data to improve the accuracy of our machine learning model.
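
As a quick refresher, the techniques recapped above can be sketched in pandas. This is a minimal illustration on made-up values, not the previous chapter's actual code: the column names `glucose` and `bmi`, the choice of mean imputation, and the manual z-score and min-max formulas are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values (not the real dataset).
df = pd.DataFrame({
    "glucose": [148.0, np.nan, 183.0, 89.0],
    "bmi": [33.6, 26.6, np.nan, 28.1],
})

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: impute missing values with each column's mean.
imputed = df.fillna(df.mean())

# Standardize (z-score): each column gets mean 0 and unit variance.
standardized = (imputed - imputed.mean()) / imputed.std(ddof=0)

# Normalize (min-max): each column is rescaled to the range [0, 1].
normalized = (imputed - imputed.min()) / (imputed.max() - imputed.min())

print(dropped.shape, imputed.isna().sum().sum())
```

Dropping rows is the simplest option but discards data, while imputation keeps every row at the cost of introducing estimated values.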

It is important to note that, up to this point, we have only worked with features that are quantitative. We will now shift into dealing with categorical data, in addition to the quantitative data that has missing values. Our main focus will be to work with our given features to construct entirely new features for our models to learn from. 

There are various methods we can utilize to construct our features, with the most basic starting with the pandas library in Python to scale an existing feature by a multiple. We will be diving into some...
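
The most basic construction mentioned above, scaling an existing feature by a multiple, might look like the following in pandas. This is only a sketch: the DataFrame, the column names `weight_kg` and `weight_g`, and the factor of 1000 are hypothetical and chosen purely for illustration.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({"weight_kg": [60, 72, 85]})

# Construct a new feature by multiplying an existing column by a constant.
# The original column is kept, so both versions remain available.
df["weight_g"] = df["weight_kg"] * 1000

print(df)
```

Assigning the result to a new column, rather than overwriting the original, makes it easy to compare model performance with and without the constructed feature.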