-
Book Overview & Buying
-
Table Of Contents
Python Feature Engineering Cookbook - Third Edition
By :
Missing data—meaning the absence of values for certain observations—is an unavoidable problem in most data sources. Some machine learning model implementations can handle missing data out of the box. To train other models, we must remove observations with missing data or transform them into permitted values.
The act of replacing missing data with their statistical estimates is called imputation. The goal of any imputation technique is to produce a complete dataset. There are multiple imputation methods. We select which one to use, depending on whether the data is missing at random, the proportion of missing values, and the machine learning model we intend to use. In this chapter, we will discuss several imputation methods.
This chapter will cover the following recipes: