In this chapter, we explored the data preparation techniques that underpin a high-performing regression analysis. These techniques improve the quality of the data, which in turn improves the accuracy and efficiency of the subsequent knowledge extraction process. Analyzing data that has not been carefully screened for problems such as missing values and outliers can produce misleading results. For this reason, we have to get the data into a form that the algorithm can use to build a predictive model. We started by looking at different ways to transform data and at how much cleaning the data requires. We then analyzed the techniques available for preparing the data best suited to analysis and modeling, which include imputing missing data, detecting and eliminating outliers, and adding derived variables.
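As a compact reminder of the three preparation steps named above, the following sketch shows one possible form of each, using only the Python standard library. It is an illustration under stated assumptions rather than the chapter's own code: the function names (`impute_mean`, `remove_outliers_iqr`, `add_ratio`), the choice of mean imputation, and the 1.5 × IQR rule for outliers are all assumptions picked for brevity.

```python
from statistics import mean, quantiles


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule, k=1.5)."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def add_ratio(rows, numerator, denominator, name):
    """Return new rows with a derived variable: numerator / denominator."""
    return [{**row, name: row[numerator] / row[denominator]} for row in rows]


# Hypothetical data, invented only to exercise the helpers.
filled = impute_mean([1.0, None, 3.0])
cleaned = remove_outliers_iqr([10, 12, 11, 13, 12, 11, 100])
derived = add_ratio([{"mass": 10.0, "volume": 4.0}], "mass", "volume", "density")
```

In practice, the imputation strategy and the outlier threshold should be chosen to suit the data at hand. The mean is sensitive to the very outliers being removed, so the order of the steps matters.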
Then we learned how to scale the data, a process that eliminates the units of measurement and makes it easy to compare data from different locations. Data scaling is a preprocessing technique usually employed...