Book Image

Applied Supervised Learning with Python

By : Benjamin Johnston, Ishita Mathur
Book Image

Applied Supervised Learning with Python

By: Benjamin Johnston, Ishita Mathur

Overview of this book

Machine learning—the ability of a machine to give right answers based on input data—has revolutionized the way we do business. Applied Supervised Learning with Python provides a rich understanding of how you can apply machine learning techniques in your data science projects using Python. You'll explore Jupyter Notebooks, the technology used commonly in academic and commercial circles with in-line code running support. With the help of fun examples, you'll gain experience working on the Python machine learning toolkit—from performing basic data cleaning and processing to working with a range of regression and classification algorithms. Once you’ve grasped the basics, you'll learn how to build and train your own models using advanced techniques such as decision trees, ensemble modeling, validation, and error metrics. You'll also learn data visualization techniques using powerful Python libraries such as Matplotlib and Seaborn. This book also covers ensemble modeling and random forest classifiers along with other methods for combining results from multiple models, and concludes by delving into cross-validation to test your algorithm and check how well the model works on unseen data. By the end of this book, you'll be equipped to not only work with machine learning algorithms, but also be able to create some of your own!
Table of Contents (9 chapters)

Linear Regression as a Classifier


We covered linear regression in the context of predicting continuous variable output in the previous chapter, but it can also be used to predict the class that a set of data is a member of. Linear regression classifiers are not as powerful as other types of classifiers that we will cover in this chapter, but they are particularly useful in understanding the process of classification. Let's say we had a fictional dataset containing two separate groups, Xs and Os, as shown in Figure 4.1. We could construct a linear classifier by first using linear regression to fit the equation of a straight line to the dataset. For any value that lies above the line, the X class would be predicted, and for any value beneath the line, the O class would be predicted. Any dataset that can be separated by a straight line is known as linearly separable, which forms an important subset of data types in machine learning problems. While this may not be particularly helpful in the...