Feature Engineering Made Easy

By: Sinan Ozdemir, Divya Susarla

Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start by understanding your data: the success of your ML models often depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features and to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will be proficient in feature selection, feature learning, and feature optimization.

The types of feature selection


Recall that our goal with feature selection is to improve our machine learning capabilities by increasing predictive power and reducing the time cost. To do this, we introduce two broad categories of feature selection: statistical-based and model-based. Statistical-based feature selection relies heavily on statistical tests, which are separate from our machine learning models, to select features during the training phase of our pipeline. Model-based selection relies on a preprocessing step that trains a secondary machine learning model and uses that model's predictive power to select features.
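
As a rough illustration of the two categories, the following sketch applies one selector of each kind with scikit-learn. The synthetic dataset, the choice of the ANOVA F-test, the value k=5, the decision tree as the secondary model, and the "mean" importance threshold are all assumptions made for this example and are not taken from the text above.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (an assumption for illustration only):
# 20 features, of which 5 carry signal about the label.
X, y = make_classification(
    n_samples=500,
    n_features=20,
    n_informative=5,
    n_redundant=0,
    random_state=0,
)

# Statistical-based selection: score every feature with a statistical test
# (here an ANOVA F-test) that is independent of any predictive model,
# then keep the k highest-scoring features.
stat_selector = SelectKBest(score_func=f_classif, k=5)
X_stat = stat_selector.fit_transform(X, y)

# Model-based selection: fit a secondary model (here a decision tree) and
# keep the features whose importance is at least the mean importance.
model_selector = SelectFromModel(
    DecisionTreeClassifier(random_state=0),
    threshold="mean",
)
X_model = model_selector.fit_transform(X, y)

print("original features:   ", X.shape[1])
print("statistical-based:   ", X_stat.shape[1])
print("model-based:         ", X_model.shape[1])
```

Both selectors expose the same fit/transform interface, so either one could be placed as a step in a scikit-learn Pipeline ahead of the final estimator.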

Both types of feature selection attempt to reduce the size of our data by keeping only the subset of our original features with the highest predictive power. We may intelligently choose which feature selection method might work best for us, but in reality, a very valid way of working in this domain is to work through examples of each...