
Least Angle Regression


Although very similar to Lasso (seen in Chapter 6, Achieving Generalization), Least Angle Regression, or simply LARS, is a regression algorithm that quickly and efficiently selects the most useful features for the model, even when they are strongly correlated with each other. LARS is an evolution of the Forward Selection algorithm (also called Forward Stepwise Regression) and of the Forward Stagewise Regression algorithm.
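
In Python, a quick way to try LARS is scikit-learn's Lars estimator. The following is a minimal sketch on synthetic data, not code from the book; the dataset and the n_nonzero_coefs value are illustrative choices:

from sklearn.datasets import make_regression
from sklearn.linear_model import Lars

# Synthetic problem with a few informative, possibly correlated predictors
X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       noise=5.0, random_state=42)

# Fit LARS, asking it to keep at most four nonzero coefficients
lars = Lars(n_nonzero_coefs=4)
lars.fit(X, y)

print(lars.coef_)           # most coefficients are exactly zero
print(lars.predict(X[:3]))  # predictions for the first three samples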

Here is how the Forward Selection algorithm works, assuming that all the variables, including the target, have been previously normalized (a minimal code sketch follows the steps):

  1. Of all the possible predictors for a problem, the one with the largest absolute correlation with the target variable y is selected (that is, the one with the greatest explanatory power). Let's call it p1.

  2. All the other predictors are now projected onto p1, and the projection is removed, creating residual vectors orthogonal to p1.

  3. Step 1 is repeated on the residuals: at each iteration, the predictor whose residual is most correlated with the residual of the target is added to the model, until the fit is satisfactory or all the predictors have been used.
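
These three steps translate almost directly into NumPy. The sketch below is not the book's own code; forward_selection and its arguments are hypothetical names, and it assumes the columns of X and the target y are already normalized, as stated above:

import numpy as np

def forward_selection(X, y, n_features):
    """Greedy Forward Selection by successive orthogonalization."""
    X_res, y_res = X.copy(), y.copy()
    selected = []
    for _ in range(n_features):
        # Step 1: pick the predictor most correlated with the current target
        corr = X_res.T @ y_res
        j = int(np.argmax(np.abs(corr)))
        selected.append(j)
        p = X_res[:, j]
        norm2 = p @ p
        if norm2 == 0:
            break
        # Step 2: remove the projection onto p from the target and from all
        # predictors, leaving residuals orthogonal to the chosen predictor
        y_res = y_res - (p @ y_res) / norm2 * p
        X_res = X_res - np.outer(p, (p @ X_res) / norm2)
        # Step 3: the loop repeats step 1 on these residuals
    return selected

A quick check on synthetic data (again, an illustrative example only):

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
y = X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(100)
X = (X - X.mean(0)) / X.std(0)   # normalize, as the algorithm requires
y = (y - y.mean()) / y.std()
print(forward_selection(X, y, 2))  # expected to pick columns 0 and 2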