Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits

By: Tarek Amr

Overview of this book

Machine learning is applied everywhere, from business to research and academia, while scikit-learn is a versatile library that is popular among machine learning practitioners. This book serves as a practical guide for anyone looking to provide hands-on machine learning solutions with scikit-learn and Python toolkits. The book begins with an explanation of machine learning concepts and fundamentals, and strikes a balance between theoretical concepts and their applications. Each chapter covers a different set of algorithms, and shows you how to use them to solve real-life problems. You’ll also learn about various key supervised and unsupervised machine learning algorithms using practical examples. Whether it is an instance-based learning algorithm, Bayesian estimation, a deep neural network, a tree-based ensemble, or a recommendation system, you’ll gain a thorough understanding of its theory and learn when to apply it. As you advance, you’ll learn how to deal with unlabeled data and when to use different clustering and anomaly detection algorithms. By the end of this machine learning book, you’ll have learned how to take a data-driven approach to provide end-to-end machine learning solutions. You’ll also have discovered how to formulate the problem at hand, prepare required data, and evaluate and deploy models in production.

Table of Contents (18 chapters)

Section 1: Supervised Learning
Section 2: Advanced Supervised Learning
Section 3: Unsupervised Learning and More

Installing the imbalanced-learn library

Because our classes are imbalanced, we will need to resample our training data or apply other techniques to get better classification results. For this, we are going to rely on the imbalanced-learn library. The project was started in 2014 by Fernando Nogueira. It now offers multiple data resampling techniques, as well as metrics for evaluating imbalanced classification problems. The library's interface is compatible with scikit-learn.
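
To make the idea of resampling concrete, here is a minimal sketch of random oversampling written with the standard library only. It is an illustration under stated assumptions, not the library's implementation: the function name random_oversample and the toy data are hypothetical. The library provides a comparable resampler, RandomOverSampler in imblearn.over_sampling, which is used through a fit_resample(X, y) call.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest one.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())

    X_res, y_res = list(X), list(y)
    for label, count in counts.items():
        # Positions of the original samples that belong to this class
        indices = [i for i, lbl in enumerate(y) if lbl == label]
        for _ in range(target - count):
            i = rng.choice(indices)
            X_res.append(X[i])
            y_res.append(label)
    return X_res, y_res


# Toy training set: eight samples of class 0 and two of class 1
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [5.0], [5.5]]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y))      # class counts before resampling
print(Counter(y_res))  # class counts after resampling
```

Note that only the training data is resampled here; the test data should keep its original class distribution so that the evaluation reflects the real problem.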

You can install the library via pip by running the following command in your terminal:

          pip install -U imbalanced-learn
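
One detail that is easy to miss: the package is installed as imbalanced-learn but imported as imblearn. The following sketch checks whether a module can be imported in the current environment; the helper name is_installed is hypothetical and uses only the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# The import name is "imblearn", not "imbalanced-learn"
if is_installed("imblearn"):
    print("imbalanced-learn is available")
else:
    print("imbalanced-learn is missing; run: pip install -U imbalanced-learn")
```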
        

Now, you can import and use its different modules in your code, as we will see in the following sections. One of the metrics provided by the library is the geometric mean score. In Chapter 8, Ensembles – When One Model is Not Enough, we learned about the true positive rate (TPR), or sensitivity, and the false positive rate (FPR), and we used them...
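
As a rough illustration of the metric, the sketch below computes the geometric mean score for a binary problem by hand. It assumes the usual binary definition, the square root of the product of sensitivity (TPR) and specificity (the true negative rate, which equals 1 - FPR). The function names and the toy labels are hypothetical.

```python
from math import sqrt


def binary_rates(y_true, y_pred, positive=1):
    """Return (TPR, TNR) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity, that is, 1 - FPR
    return tpr, tnr


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity."""
    tpr, tnr = binary_rates(y_true, y_pred, positive)
    return sqrt(tpr * tnr)


# Imbalanced toy labels: eight negatives and two positives
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
always_negative = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
print(accuracy)                         # accuracy of always predicting the majority class
print(g_mean(y_true, always_negative))  # G-mean of the same predictions
print(g_mean(y_true, y_pred))           # G-mean of the mixed predictions
```

A classifier that always predicts the majority class is right on 80% of these samples, yet its geometric mean is zero because it never finds a positive case. That is what makes the metric useful for imbalanced problems. With the library installed, geometric_mean_score(y_true, y_pred) from imblearn.metrics should give the same values for this binary case.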