Overview of this book

Feature engineering is the most important step in creating powerful machine learning systems. This book will take you through the entire feature-engineering journey to make your machine learning much more systematic and effective. You will start with understanding your data. Often the success of your ML models depends on how you leverage different feature types, such as continuous, categorical, and more. You will learn when to include a feature, when to omit it, and why, all by understanding error analysis and the acceptability of your models. You will learn to convert a problem statement into useful new features. You will learn to deliver features driven by business needs as well as mathematical insights. You'll also learn how to use machine learning on your machines, automatically learning amazing features for your data. By the end of the book, you will become proficient in Feature Selection, Feature Learning, and Feature Optimization.

A deeper look into the principal components

Before we move on to our second feature transformation algorithm, it is important to look at how principal components are interpreted:

1. Our `iris` dataset is a `150 x 4` matrix. When we fit PCA with `n_components` set to `2`, we obtained a components matrix of size `2 x 4`:
```
# how to interpret and use components
pca.components_ # a 2 x 4 matrix

array([[ 0.52237162, -0.26335492,  0.58125401,  0.56561105],
       [ 0.37231836,  0.92555649,  0.02109478,  0.06541577]])
```
2. Just like in our manual example of calculating eigenvectors, the `components_` attribute can be used to project data using matrix multiplication. We do so by multiplying our original (scaled) dataset by the transpose of the `components_` matrix:
```
# Multiply original matrix (150 x 4) by components transposed (4 x 2) to get new columns (150 x 2)
np.dot(X_scaled, pca.components_.T)[:5,]

array([[-2.26454173,  0.5057039 ],
       [-2.0864255 , -0.65540473],
       [-2.36795045, -0.31847731],
       [-2...
```
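To make the interpretation more concrete, the following is a minimal sketch that uses only NumPy. The `components` array is copied from the `pca.components_` output printed above. The `X_demo` matrix is a hypothetical stand-in for a few rows of `X_scaled`, not real iris data. The sketch checks two things: the component rows behave like unit-length, mutually orthogonal directions, and each projected value is a weighted sum of the four original features.

```python
import numpy as np

# Components copied from the pca.components_ output shown above (2 x 4).
components = np.array([
    [0.52237162, -0.26335492, 0.58125401, 0.56561105],
    [0.37231836,  0.92555649, 0.02109478, 0.06541577],
])

# If the rows are unit-length and orthogonal to each other, then
# components @ components.T is (approximately) the 2 x 2 identity matrix.
gram = components @ components.T

# Hypothetical, already-scaled rows standing in for X_scaled (3 x 4).
# These values are made up for illustration only.
X_demo = np.array([
    [-1.0,  1.0, -1.0, -1.0],
    [ 0.0,  0.0,  0.0,  0.0],
    [ 1.0, -1.0,  1.0,  1.0],
])

# Projection: (3 x 4) times (4 x 2) gives new columns of shape (3 x 2).
projected = np.dot(X_demo, components.T)

# The same first entry computed by hand: a weighted sum of the four
# features of the first row, using the first component as the weights.
first_entry = sum(w * x for w, x in zip(components[0], X_demo[0]))

print(gram.round(6))
print(projected.shape)
print(first_entry, projected[0, 0])
```

Read this way, each row of `components_` is a set of weights over the original columns. A weight that is large in absolute value means the corresponding feature contributes strongly to that principal component.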