We expect, or at least hope, that the residuals of a regression are just random noise. If that is not the case, then our regressor may be ignoring information in the data. More specifically, we expect the residuals to be independent and normally distributed. Normality is relatively easy to check with a histogram or a QQ plot. In general, we want the mean of the residuals to be as close to zero as possible, and we want the variance of the residuals to be as small as possible. An ideal fit will have zero-valued residuals.
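The checks described above can be sketched numerically before any plotting. The following is a minimal illustration, not part of the recipe itself: the helper name `residual_summary` and the synthetic data are assumptions made for the example, and the QQ check is approximated by correlating the sorted residuals with standard normal quantiles (a correlation near 1 suggests the QQ plot would be close to a straight line).

```python
import numpy as np
from statistics import NormalDist


def residual_summary(y_true, y_pred):
    """Return the mean, variance, and QQ correlation of the residuals.

    Hypothetical helper for illustration; not part of dautil or the recipe.
    """
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    n = residuals.size
    # Theoretical standard normal quantiles at plotting positions (i - 0.5) / n
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    # Correlation between the normal quantiles and the sorted residuals
    qq_corr = np.corrcoef(theoretical, np.sort(residuals))[0, 1]
    return residuals.mean(), residuals.var(), qq_corr


# Synthetic stand-ins for the real target and predictions (assumed data)
rng = np.random.default_rng(0)
y_true = rng.normal(10.0, 2.0, size=500)
y_pred = y_true - rng.normal(0.0, 0.5, size=500)

mean, var, qq_corr = residual_summary(y_true, y_pred)
print(f"mean={mean:.3f} variance={var:.3f} QQ correlation={qq_corr:.3f}")
```

With real data, the same function could be applied to the loaded target and prediction arrays. A mean far from zero or a low QQ correlation would suggest that the residuals are not simple Gaussian noise.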
The imports are as follows:
import numpy as np
import matplotlib.pyplot as plt
import dautil as dl
import seaborn as sns
from scipy.stats import probplot
from IPython.display import HTML
Load the target and predictions for the boosting regressor:
y_test = np.load('temp_y_test.npy')
preds = np.load('boosting.npy')
Plot the actual and predicted values as follows:
sp = dl.plotting.Subplotter(2, 2, context)
cp = dl.plotting.CyclePlotter(sp.ax)
cp.plot(y_test)
cp.plot(preds)
...