Feature Engineering Made Easy

By: Sinan Ozdemir, Divya Susarla

Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start with understanding your data: often the success of your ML models depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features. You will learn to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will become proficient in Feature Selection, Feature Learning, and Feature Optimization.

Dimension reduction – feature transformations versus feature selection versus feature construction

In the last section, I mentioned how we could squish datasets to have fewer columns to describe data in new ways. This sounds similar to the concept of feature selection: removing columns from our original dataset to create a different, potentially better view of our dataset by cutting out the noise and enhancing signal columns. While both feature selection and feature transformation are methods of performing dimension reduction, it is worth mentioning that they could not be more different in their methodologies.

Feature selection processes can only choose features from the original set of columns, while feature transformation algorithms take these original columns and combine them in useful ways to create new columns that describe the data better than any single column from the original dataset. Therefore, feature selection methods reduce dimensions by isolating...
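
To make the contrast concrete, here is a minimal sketch using only NumPy. It is an illustration under stated assumptions, not the book's own code: the toy dataset, the helper names `select_top_variance` and `pca_transform`, and the choice of variance as the selection criterion are all hypothetical. The selection helper returns an untouched subset of the original columns, while the transformation helper (a bare-bones principal component projection) returns new columns, each a weighted combination of all the original ones.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy dataset: 100 rows, 4 columns with very different spreads.
X = rng.normal(size=(100, 4)) * np.array([3.0, 0.1, 2.0, 0.5])


def select_top_variance(X, k):
    """Feature selection: keep the k original columns with the highest variance."""
    keep = np.sort(np.argsort(X.var(axis=0))[-k:])
    return X[:, keep], keep


def pca_transform(X, k):
    """Feature transformation: project centered data onto the top k principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    return Xc @ components.T, components


X_selected, kept = select_top_variance(X, k=2)
X_transformed, components = pca_transform(X, k=2)

print("columns kept by selection:", kept)
print("weights defining the new columns:\n", components.round(3))
```

Both results have two columns, but `X_selected` is literally a slice of `X`, whereas every column of `X_transformed` mixes all four original columns through the weights in `components`. In practice a library such as scikit-learn would typically handle both steps; the hand-rolled versions here are only meant to expose the mechanics.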