Customizing scikit-learn transformers
Now that we have a process for transforming the DataFrame into a machine learning-ready sparse matrix, it would be advantageous to generalize the process with transformers so that it can easily be repeated for new data coming in.
Scikit-learn transformers work with machine learning algorithms by using a fit method, which finds model parameters, and a transform method, which applies these parameters to data. These methods may be combined into a single fit_transform method that fits and transforms data in one line of code.
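As a minimal sketch of these three methods, the following uses StandardScaler on a small made-up array. The data and variable names are assumptions chosen for illustration only; they are not part of the dataset used in this chapter.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration: three rows, two numeric columns.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_new = np.array([[4.0, 40.0]])

scaler = StandardScaler()

# fit learns the parameters (here, each column's mean and standard deviation).
scaler.fit(X_train)

# transform applies the learned parameters to data.
X_train_scaled = scaler.transform(X_train)

# fit_transform performs both steps in a single call.
X_train_scaled_again = StandardScaler().fit_transform(X_train)

# New data is transformed with the parameters learned from the training data.
X_new_scaled = scaler.transform(X_new)
```

Note that the new data is only transformed, never fit, so it is scaled with the mean and standard deviation found on the training data.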
Various transformers, including machine learning algorithms, may be chained together in the same pipeline for ease of use. Data placed in the pipeline is then fit and transformed by each step in turn to achieve the desired output.
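The pipeline idea can be sketched as follows. The toy data, the step names, and the choice of LogisticRegression as the final step are all assumptions made for illustration; any scikit-learn estimator could take its place.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration; one value is missing.
X = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0], [4.0, 40.0]])
y = np.array([0, 0, 1, 1])

# Each step is a (name, object) pair; the step names are arbitrary labels.
pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),  # replace nulls with the column mean
    ("scaler", StandardScaler()),                 # standardize each column
    ("model", LogisticRegression()),              # final step: a machine learning algorithm
])

# Fitting the pipeline fits and transforms with each transformer in order,
# then fits the final model on the transformed data.
pipeline.fit(X, y)

# Predicting applies the already-fitted transformers before the model predicts.
predictions = pipeline.predict(X)
```

Because the whole sequence lives in one object, the same preprocessing is applied to new data with a single call to predict.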
Scikit-learn comes with many great transformers, such as StandardScaler and Normalizer to standardize and normalize data, respectively, and SimpleImputer to replace null values. You have to...