## Univariate linear regression

We begin by looking at a simple way to predict a quantitative response, *Y*, with one predictor variable, *x*, assuming that *Y* has a linear relationship with *x*. The model can be written as *Y = B0 + B1x + e*. In words, the expected value of *Y* is the intercept *B0* plus the slope *B1* times *x*, and each observed *Y* differs from that expected value by an error term *e*. The least squares approach chooses the parameters that minimize the **Residual Sum of Squares** (**RSS**), the sum of the squared differences between the actual *Y* values and the predicted *y* values. For a simple example, say the actual values *Y1* and *Y2* are *10* and *20*, respectively, and the predictions *y1* and *y2* are *12* and *18*. To calculate RSS, we add the squared differences, *RSS = (Y1 - y1)^{2} + (Y2 - y2)^{2}*, which, with simple substitution, yields

*(10 - 12)^{2} + (20 - 18)^{2} = 8*.

I once remarked to a peer during our Lean Six Sigma Black Belt training that it's all about the sum of squares; understand the sum of squares and the...
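Returning to the RSS calculation above, the arithmetic is easy to check in code. What follows is only a minimal sketch in Python, which is our choice for illustration; the function name `residual_sum_of_squares` is a hypothetical helper, not part of any library. It uses the two actual values (*10* and *20*) and the two predictions (*12* and *18*) from the example.

```python
# Actual responses (Y1, Y2) and predictions (y1, y2) from the worked example.
actual = [10, 20]
predicted = [12, 18]


def residual_sum_of_squares(actual, predicted):
    """Return the sum of squared differences between actual and predicted values.

    Hypothetical helper written for this illustration only.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Square each residual (actual minus predicted) and add them up.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


rss = residual_sum_of_squares(actual, predicted)
print(rss)  # 8, that is, (10 - 12)^2 + (20 - 18)^2 = 4 + 4
```

Because each residual is squared, an over-prediction and an under-prediction of the same size contribute equally to RSS, so they cannot cancel each other out.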