Applied Supervised Learning with Python

By: Benjamin Johnston, Ishita Mathur

Overview of this book

Machine learning, the ability of a machine to give correct answers based on input data, has revolutionized the way we do business. Applied Supervised Learning with Python shows how you can apply machine learning techniques in your data science projects using Python. You'll explore Jupyter Notebooks, a technology widely used in academic and commercial circles that supports running code inline. With the help of fun examples, you'll gain experience working on the Python machine learning toolkit, from performing basic data cleaning and processing to working with a range of regression and classification algorithms. Once you've grasped the basics, you'll learn how to build and train your own models using advanced techniques such as decision trees, ensemble modeling, validation, and error metrics. You'll also learn data visualization techniques using powerful Python libraries such as Matplotlib and Seaborn. This book also covers ensemble modeling and random forest classifiers along with other methods for combining results from multiple models, and concludes by delving into cross-validation to test your algorithm and check how well the model works on unseen data. By the end of this book, you'll be equipped not only to work with machine learning algorithms, but also to create some of your own!

Preface

Python Machine Learning Toolkit
    Introduction
    Supervised Machine Learning
    Jupyter Notebooks
    pandas
    Data Quality Considerations
    Summary

Exploratory Data Analysis and Visualization
    Introduction
    Summary Statistics and Central Values
    Missing Values
    Distribution of Values
    Relationships within the Data
    Summary

Regression Analysis
    Introduction
    Regression and Classification Problems
    Linear Regression
    Multiple Linear Regression
    Autoregression Models
    Summary

Classification
    Introduction
    Linear Regression as a Classifier
    Logistic Regression
    Classification Using K-Nearest Neighbors
    Classification Using Decision Trees
    Summary

Ensemble Modeling
    Introduction
    Overfitting and Underfitting
    Bagging
    Boosting
    Summary

Model Evaluation
    Introduction
    Evaluation Metrics
    Splitting the Dataset
    Performance Improvement Tactics
    Summary

Appendix
    Chapter 1: Python Machine Learning Toolkit
    Chapter 2: Exploratory Data Analysis and Visualization
    Chapter 3: Regression Analysis
    Chapter 4: Classification
    Chapter 5: Ensemble Modeling
    Chapter 6: Model Evaluation

Preface

Note

About

This section briefly introduces the authors, what this book covers, the technical skills you'll need to get started, and the hardware and software required to complete all of the included activities and exercises.

About the Book

Machine learning—the ability of a machine to give correct answers based on input data—has revolutionized the way we do business. Applied Supervised Learning with Python provides a rich understanding of how you can apply machine learning techniques to your data science projects using Python. You'll explore Jupyter notebooks, a technology that's widely used in academic and commercial circles with support for running inline code.

With the help of fun examples, you'll gain experience working on the Python machine learning toolkit—from performing basic data cleaning and processing to working with a range of regression and classification algorithms. Once you've grasped the basics, you'll learn how to build and train your own models using advanced techniques such as decision trees, ensemble modeling, validation, and error metrics. You'll also learn data visualization techniques using powerful Python libraries such as Matplotlib and Seaborn.

This book also covers ensemble modeling and random forest classifiers, along with other methods for combining results from multiple models, and concludes by delving into cross-validation to test your algorithm and check how well the model works on unseen data.

By the end of this book, you'll be equipped to not only work with machine learning algorithms, but also be able to create some of your own!

About the Authors

Benjamin Johnston is a senior data scientist for one of the world's leading data-driven medtech companies and is involved in the development of innovative digital solutions throughout the entire product development pathway, from problem definition, through solution research and development, to final deployment. He is currently completing his PhD in machine learning, specializing in image processing and deep convolutional neural networks. He has more than 10 years' experience in medical device design and development, working in a variety of technical roles, and holds first-class honors bachelor's degrees in both engineering and medical science from the University of Sydney, Australia.

Ishita Mathur has worked as a data scientist for 2.5 years with product-based start-ups, taking business concerns in various domains and formulating them as technical problems that can be solved using data and machine learning. Her current work at GO-JEK involves the end-to-end development of machine learning projects as part of a product team, defining, prototyping, and implementing data science models within the product. She completed her master's degree in high-performance computing with data science at the University of Edinburgh, UK, and her bachelor's degree with honors in physics at St. Stephen's College, Delhi.

Objectives

Understand the concept of supervised learning and its applications
Implement common supervised learning algorithms using machine learning Python libraries
Validate models using the k-fold technique
Build your models with decision trees to get results effortlessly
Use ensemble modeling techniques to improve the performance of your model
Apply a variety of metrics to compare machine learning models
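
To make the k-fold objective above concrete, here is a minimal sketch of how k-fold validation partitions a dataset. It is not the book's implementation: the function name k_fold_indices is hypothetical, only the standard library is used, and no shuffling is applied. In practice a library class such as scikit-learn's KFold is the more usual choice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) once for each of k folds.

    Hypothetical helper for illustration only; folds are contiguous and
    unshuffled, which is an assumption made to keep the sketch short.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = indices[start:stop]           # held-out fold
        train = indices[:start] + indices[stop:]   # everything else
        yield train, validation
        start = stop


if __name__ == "__main__":
    for train, validation in k_fold_indices(6, 3):
        print(train, validation)
    # [2, 3, 4, 5] [0, 1]
    # [0, 1, 4, 5] [2, 3]
    # [0, 1, 2, 3] [4, 5]
```

Each sample lands in the validation set exactly once, so a model trained and scored on every fold is always evaluated on data it did not see during training.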

Audience

Applied Supervised Learning with Python is for you if you want to gain a solid understanding of machine learning using Python. It'll help if you have some experience in any functional or object-oriented language and a basic understanding of Python libraries and expressions, such as arrays and dictionaries.

Approach

Applied Supervised Learning with Python takes a hands-on approach toward understanding supervised learning with Python. It contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.

Hardware Requirements

For an optimal student experience, we recommend the following hardware configuration:

Processor: Dual Core or better
Memory: 4 GB RAM
Hard disk: 10 GB available space
Internet connection

Software Requirements

You'll also need the following software installed in advance:

Any of the following operating systems:
    Windows 7 SP1 32/64-bit, Windows 8.1 32/64-bit, or Windows 10 32/64-bit
    Ubuntu 14.04 or later
    macOS Sierra or later
Browser: Google Chrome or Mozilla Firefox
Anaconda

Conventions

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "This can be easily verified using Python's built-in type function."

A block of code is set as follows:

description_features = [
    'injuries_description', 'damage_description',
    'total_injuries_description', 'total_damage_description'
]

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Click on the Untitled text and a popup will appear allowing you to rename the notebook."

Installation and Setup

Jupyter notebooks are available once you install Anaconda on your system. Anaconda can be installed for Windows systems using the steps available at https://docs.anaconda.com/anaconda/install/windows/.

For other systems, navigate to the respective installation guide from https://docs.anaconda.com/anaconda/install/.
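
Once Anaconda is installed, a quick check like the following can confirm that the environment is usable. This is a sketch under assumptions: the package list simply mirrors the libraries named in this preface (pandas, Matplotlib, Seaborn) and is not an official requirements list, and check_packages is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed list: libraries mentioned by name in this preface,
# not an official list of the book's dependencies.
EXPECTED_PACKAGES = ["pandas", "matplotlib", "seaborn"]


def check_packages(names):
    """Map each top-level package name to True if it can be imported here."""
    # find_spec returns None when a top-level package cannot be located.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_packages(EXPECTED_PACKAGES).items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

Running this in a notebook cell or from the Anaconda prompt lists any package that still needs to be installed before starting the exercises.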

Installing the Code Bundle

Copy the code bundle for the book to the C:/Code folder.

Additional Resources

The code bundle for this book is also hosted on GitHub at: https://github.com/TrainingByPackt/Applied-Supervised-Learning-with-Python.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
