#### Overview of this book

Would you like to understand how and why machine learning techniques and data analytics are spearheading enterprises globally? From analyzing bioinformatics to predicting climate change, machine learning plays an increasingly pivotal role in our society. Although the real-world applications may seem complex, this book simplifies supervised learning for beginners with a step-by-step interactive approach. Working with real-time datasets, you’ll learn how supervised learning, when used with Python, can produce efficient predictive models. Starting with the fundamentals of supervised learning, you’ll quickly move to understand how to automate manual tasks and the process of assessing date using Jupyter and Python libraries like pandas. Next, you’ll use data exploration and visualization techniques to develop powerful supervised learning models, before understanding how to distinguish variables and represent their relationships using scatter plots, heatmaps, and box plots. After using regression and classification models on real-time datasets to predict future outcomes, you’ll grasp advanced ensemble techniques such as boosting and random forests. Finally, you’ll learn the importance of model evaluation in supervised learning and study metrics to evaluate regression and classification tasks. By the end of this book, you’ll have the skills you need to work on your real-life supervised learning Python projects.
Preface
1. Fundamentals
Free Chapter
2. Exploratory Data Analysis and Visualization
3. Linear Regression
4. Autoregression
5. Classification Techniques
6. Ensemble Modeling
7. Model Evaluation

# Linear Regression

We will start our investigation into regression models with the selection of a linear model. Linear models, while being a great first choice due to their intuitive nature, are also very powerful in their predictive power, assuming datasets contain some degree of linear or polynomial relationship between the input features and values. The intuitive nature of linear models often arises from the ability to view data as plotted on a graph and observe a trending pattern in the data with, say, the output (the y-axis value for the data) trending positively or negatively with the input (the x-axis value). The fundamental components of linear regression models are also often learned during high school mathematics classes. You may recall that the equation of a straight line is defined as follows:

Figure 3.13: Equation of a straight line

Here, x is the input value and y is the corresponding output or predicted value. The parameters of the model are the...