# Regularization

When working with regressions, we can add a penalty term to the regression equation to reduce overfitting by penalizing large coefficients; this is called **regularization**. The model then looks for the coefficients that minimize the cost function with this penalty term included. The idea is to shrink the coefficients toward zero for features that don't contribute much to reducing the error of the model. Some common techniques are ridge regression, LASSO (short for *Least Absolute Shrinkage and Selection Operator*) regression, and elastic net regression, which combines the LASSO and ridge penalty terms. Note that since these penalties depend on the magnitude of the coefficients, which in turn depends on the units of each feature, the data should be scaled beforehand.
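
To see why scaling matters, consider a minimal sketch that uses only the Python standard library. It assumes a single feature, no intercept, and a squared-coefficient penalty with a hypothetical strength `lam`; the data values and the helper names (`ridge_coefficient`, `standardize`, `shrinkage`) are made up for illustration and are not part of any library. The same distances are expressed in kilometers and in meters, and we compare how much the penalty shrinks the coefficient in each case:

```python
from statistics import mean, pstdev


def ridge_coefficient(x, y, lam):
    """Closed-form minimizer of sum((y - b*x)**2) + lam * b**2 for one feature."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def shrinkage(x, y, lam):
    """Ratio of the penalized coefficient to the unpenalized (lam=0) one."""
    return ridge_coefficient(x, y, lam) / ridge_coefficient(x, y, 0.0)


# Hypothetical data: the same feature measured in two different units
km = [1.0, 2.0, 3.0, 4.0, 5.0]
m = [v * 1000 for v in km]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
lam = 10.0  # assumed penalty strength

shrink_km = shrinkage(km, y, lam)  # noticeably below 1
shrink_m = shrinkage(m, y, lam)    # almost exactly 1: penalty barely acts

# After standardizing, the unit of measurement no longer matters
shrink_km_scaled = shrinkage(standardize(km), y, lam)
shrink_m_scaled = shrinkage(standardize(m), y, lam)

print(shrink_km, shrink_m)
print(shrink_km_scaled, shrink_m_scaled)
```

On the raw data, the feature in kilometers is shrunk noticeably while the feature in meters is left almost untouched, even though both carry the same information. After standardizing, both versions are shrunk by the same amount, so the penalty treats features consistently regardless of their units.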

**Ridge regression**, also called **L2 regularization**, punishes high coefficients by adding the sum of the squares of the coefficients to the cost function (which regression looks to minimize when fitting), as per the following penalty term: