Understanding and applying feature engineering
Feature engineering is the general term for transforming the existing features in our dataset, creating missing features, and finally selecting the most predictive features before we start training with a given ML algorithm. These steps are more than mathematical functions that we apply to our data. Feature engineering is an art form, and doing it well makes the difference between a mediocre and a high-performing predictive model. If you want to know where to invest your time, feature engineering is the step where you can have the most impact on the quality of your final ML model. To create this impact and be efficient, we must consider the following:
- ML algorithm requirements: Do the features have to be in a specific format or range? How do we best avoid overfitting and underfitting the model?
- Domain knowledge: Are the given features sufficient for our model? Can we create...