Robust regression is designed to cope with outliers in the data better than ordinary regression does. It relies on special robust estimators, several of which are supported by statsmodels. No single estimator is best in every situation, so the choice of estimator depends on the data and the model.
In this recipe, we will fit the annual sunspot counts dataset that ships with statsmodels. We will define a simple model in which the current count depends linearly on the previous value. To demonstrate the effect of outliers, I added one artificially large value, and we will compare the robust regression model with an ordinary least squares model.
The following steps describe how to apply the robust linear model:
The imports are as follows:
import statsmodels.api as sm
import matplotlib.pyplot as plt
import dautil as dl
from IPython.display import HTML
Define the following function to set the labels of the plots:
def set_labels(ax):
    ax.set_xlabel('Year')
    ax.set_ylabel('Sunactivity...