Feature Engineering Made Easy

By: Sinan Ozdemir, Divya Susarla

Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start with understanding your data; often the success of your ML models depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features. You will learn to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will become proficient in Feature Selection, Feature Learning, and Feature Optimization.

Chapter 5. Feature Selection

We're halfway through our text. We have gotten our hands dirty with about a dozen datasets and have seen a great many feature engineering methods that we, as data scientists and machine learning engineers, may use to ensure that we are getting the most out of our predictive modeling. So far, in dealing with data, we have worked with methods including:

  • Feature understanding through the identification of levels of data
  • Feature improvements and imputing missing values
  • Feature standardization and normalization

Each of the preceding methods has a place in our data pipeline and, more often than not, two or more methods are used in tandem with one another.
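As a reminder of what using these methods in tandem looks like, here is a minimal sketch that chains mean imputation with either z-score standardization or min-max normalization. It is an illustration only, not code from this book: the helper names (`impute_mean`, `standardize`, `min_max_normalize`, `run_pipeline`) and the `ages` values are made up for the example, and it uses only the Python standard library so that each step stays visible.

```python
from statistics import mean, pstdev


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def standardize(column):
    """Z-score standardization: subtract the mean, divide by the population std."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def min_max_normalize(column):
    """Min-max normalization: rescale values into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def run_pipeline(column, steps):
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        column = step(column)
    return column


# Hypothetical feature column with two missing values.
ages = [22.0, None, 35.0, 58.0, None, 41.0]

z_scores = run_pipeline(ages, [impute_mean, standardize])
scaled = run_pipeline(ages, [impute_mean, min_max_normalize])
```

The order of the steps matters: imputation runs first because neither scaling step can handle missing values. In practice, a library such as scikit-learn offers the same idea through its `Pipeline`, `SimpleImputer`, and `StandardScaler` classes.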

The remainder of this text will focus on other methods of feature engineering that are, by nature, a bit more mathematical and complex than those in the first half of this book. As the preceding workflow grows, we will do our best to spare the reader the inner workings of each and every statistical...