#### Overview of this book

Machine learning—the ability of a machine to give right answers based on input data—has revolutionized the way we do business. Applied Supervised Learning with Python provides a rich understanding of how you can apply machine learning techniques in your data science projects using Python. You'll explore Jupyter Notebooks, the technology used commonly in academic and commercial circles with in-line code running support. With the help of fun examples, you'll gain experience working on the Python machine learning toolkit—from performing basic data cleaning and processing to working with a range of regression and classification algorithms. Once you’ve grasped the basics, you'll learn how to build and train your own models using advanced techniques such as decision trees, ensemble modeling, validation, and error metrics. You'll also learn data visualization techniques using powerful Python libraries such as Matplotlib and Seaborn. This book also covers ensemble modeling and random forest classifiers along with other methods for combining results from multiple models, and concludes by delving into cross-validation to test your algorithm and check how well the model works on unseen data. By the end of this book, you'll be equipped to not only work with machine learning algorithms, but also be able to create some of your own!
Applied Supervised Learning with Python
Preface
Free Chapter
Python Machine Learning Toolkit
Exploratory Data Analysis and Visualization
Regression Analysis
Classification
Ensemble Modeling
Model Evaluation

## Linear Regression

We will start our investigation into regression problems with the selection of a linear model. Linear models, while being a great first choice due to their intuitive nature, are also very powerful in their predictive power, assuming datasets contain some degree of linear or polynomial relationship between the input features and values. The intuitive nature of linear models often arises from the ability to view data as plotted on a graph and observe a trending pattern in the data with, say, the output (the y axis value for the data) trending positively or negatively with the input (x axis value). While often not presented as such, the fundamental components of linear regression models are also often learned during high school mathematics classes. You may recall that the equation of a straight line, or linear model, is defined as follows:

Figure 3.1: Equation of a straight line

Here, x is the input value and y is the corresponding output or predicted value. The parameters of...