Python Machine Learning Cookbook - Second Edition

By: Giuseppe Ciaburro, Prateek Joshi

Overview of this book

This eagerly anticipated second edition of the popular Python Machine Learning Cookbook will enable you to adopt a fresh approach to dealing with real-world machine learning and deep learning tasks. With the help of over 100 recipes, you will learn to build powerful machine learning applications using modern libraries from the Python ecosystem. The book will also guide you on how to implement various machine learning algorithms for classification, clustering, and recommendation engines, using a recipe-based approach. With emphasis on practical solutions, dedicated sections in the book will help you to apply supervised and unsupervised learning techniques to real-world problems. Toward the concluding chapters, you will get to grips with recipes that teach you advanced techniques including reinforcement learning, deep neural networks, and automated machine learning. By the end of this book, you will be equipped with the skills you need to apply machine learning techniques and leverage the full capabilities of the Python ecosystem through real-world examples.

What this book covers

Chapter 1, The Realm of Supervised Learning, covers various machine learning paradigms that will help you to understand how the field is divided into multiple subgroups. This chapter briefly discusses the differences between supervised and unsupervised learning, along with the concepts of regression, classification, and clustering. We will learn how to preprocess data for machine learning. We will discuss regression analysis in detail and learn how to apply it to a couple of real-world problems, including house price estimation and bicycle demand distribution.
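
As a taste of the regression workflow this chapter builds toward, here is a minimal scikit-learn sketch; it uses synthetic data rather than the book's housing or bicycle datasets, so the numbers are illustrative only.

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error

    # Synthetic stand-in for a housing-style dataset (features -> price)
    X, y = make_regression(n_samples=500, n_features=6, noise=10.0, random_state=7)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=7)

    # Fit a linear regressor and measure its error on held-out data
    model = LinearRegression()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    print("Mean squared error:", mean_squared_error(y_test, predictions))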

Chapter 2, Constructing a Classifier, shows you how to perform data classification using various models. We will discuss techniques including logistic regression and the naïve Bayes model. We will learn how to evaluate the accuracy of classification algorithms. We will discuss the concept of cross-validation and learn how to use it to validate our machine learning model. We will learn about validation curves and how to plot them. We will apply these supervised learning techniques to real-world problems, such as income bracket estimation and activity recognition.
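
As a hedged illustration of cross-validation, the sketch below scores a naïve Bayes classifier with scikit-learn's cross_val_score, using the built-in Iris data as a stand-in for the chapter's datasets.

    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # 5-fold cross-validation of a Gaussian naive Bayes classifier
    classifier = GaussianNB()
    scores = cross_val_score(classifier, X, y, cv=5, scoring='accuracy')
    print("Accuracy per fold:", scores)
    print("Mean accuracy:", scores.mean())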

Chapter 3, Predictive Modeling, covers the premise of predictive modeling and why it’s needed. We will learn about SVMs and understand how they work. We will learn how to use them to classify data. We will discuss the concept of hyperparameters and how they affect the performance of SVMs. We will learn how to use grid search to find the optimal set of hyperparameters. We will discuss how to estimate the confidence measure of the outputs. We will talk about ensemble learning and the various algorithms in this group, such as decision trees and random forests. We will then learn how to apply these techniques to real-world event prediction.
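
The following minimal sketch shows the grid search idea with scikit-learn's GridSearchCV and an SVM; the parameter grid and dataset are illustrative assumptions, not the book's exact setup.

    from sklearn.datasets import load_iris
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    X, y = load_iris(return_X_y=True)

    # Candidate hyperparameters for the SVM
    param_grid = {
        'C': [0.1, 1, 10, 100],
        'gamma': [1, 0.1, 0.01],
        'kernel': ['rbf', 'linear'],
    }

    # Exhaustively evaluate every combination with 5-fold cross-validation
    grid = GridSearchCV(SVC(), param_grid, cv=5)
    grid.fit(X, y)
    print("Best hyperparameters:", grid.best_params_)
    print("Best cross-validated accuracy:", grid.best_score_)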

Chapter 4, Clustering with Unsupervised Learning, covers the concept of unsupervised learning and what we hope to achieve from it. We will learn how to perform data clustering and how to apply the k-means clustering algorithm to do it. We will visualize the clustering process using sample data. We will discuss mixture models and Gaussian mixture models. We will then apply these techniques to perform market segmentation using customer information.
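
A minimal k-means sketch, assuming scikit-learn and synthetic blob data in place of real customer records:

    from sklearn.datasets import make_blobs
    from sklearn.cluster import KMeans

    # Synthetic customer-like data with three natural groups
    X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)

    # Fit k-means and assign each point to its nearest cluster center
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
    labels = kmeans.fit_predict(X)
    print("Cluster centers:\n", kmeans.cluster_centers_)
    print("First ten labels:", labels[:10])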

Chapter 5, Visualizing Data, discusses how to visualize data and explains why it's useful for machine learning. We will learn how to use Matplotlib to interact with our data and visualize it using various techniques. We will discuss histograms and how they are useful. We will explore different methods for visualizing data, including line charts, scatter plots, and bubble plots. We will learn how to use heat maps, perform animation, and do 3D plotting.
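
A small Matplotlib sketch along these lines, using randomly generated values rather than the book's datasets:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # Histogram of a single variable
    values = rng.normal(loc=0.0, scale=1.0, size=1000)
    ax1.hist(values, bins=30, edgecolor='black')
    ax1.set_title('Histogram')

    # Scatter plot of two related variables
    x = rng.uniform(0, 10, size=200)
    y = 2 * x + rng.normal(scale=2.0, size=200)
    ax2.scatter(x, y, alpha=0.6)
    ax2.set_title('Scatter plot')

    plt.tight_layout()
    plt.show()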

Chapter 6, Building Recommendation Engines, introduces recommendation engines and shows us how to use them for movie recommendations. We will construct a k-nearest neighbors classifier to find similar users in our dataset and then generate movie recommendations using a filtering model with TensorFlow.
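
To illustrate the nearest-neighbor part of that pipeline, here is a hedged sketch that finds similar users in a tiny, invented rating matrix with scikit-learn's NearestNeighbors (the TensorFlow filtering model is not shown):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Hypothetical user-by-movie rating matrix (one row per user)
    ratings = np.array([
        [5, 4, 0, 1, 0],
        [4, 5, 1, 0, 0],
        [1, 0, 5, 4, 4],
        [0, 1, 4, 5, 3],
    ])

    # Find the users most similar to the first user by cosine distance
    model = NearestNeighbors(n_neighbors=3, metric='cosine')
    model.fit(ratings)
    distances, indices = model.kneighbors(ratings[[0]])
    print("Nearest users (the first is the user itself):", indices[0])
    print("Cosine distances:", distances[0])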

Chapter 7, Analyzing Text Data, shows you how to analyze text data. We will understand various concepts such as the bag-of-words model, tokenization, and stemming. We will learn about the features that can be extracted from text. We will discuss how to build a text classifier. We will then use these techniques to infer the sentiment of a sentence. We will also learn how to automatically identify the topic of an unknown paragraph. We will then move on to evaluating regression and classification models, and then step into recipes that can help us with selecting models.
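
A minimal bag-of-words classifier sketch, assuming scikit-learn and a tiny invented corpus in place of the book's text data:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Tiny invented corpus with sentiment-style labels (1 = positive, 0 = negative)
    texts = [
        "I loved this movie, it was fantastic",
        "What a wonderful, uplifting film",
        "Terrible plot and awful acting",
        "I hated every minute of it",
    ]
    labels = [1, 1, 0, 0]

    # Bag-of-words features: each column counts one vocabulary word
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(texts)

    # Train a naive Bayes text classifier and score a new sentence
    classifier = MultinomialNB()
    classifier.fit(X, labels)
    test = vectorizer.transform(["a wonderful and fantastic film"])
    print("Predicted label:", classifier.predict(test)[0])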

Chapter 8, Speech Recognition, demonstrates how you can work with speech data. We will learn about concepts including windowing and convolution. We will understand how to extract features from speech data. We will learn about hidden Markov models and how to use them to automatically recognize the words being spoken.
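
As a small illustration of windowing and convolution (not the full hidden Markov model pipeline), here is a NumPy sketch on a synthetic tone:

    import numpy as np

    # One second of a synthetic 440 Hz tone sampled at 16 kHz
    sampling_rate = 16000
    t = np.linspace(0, 1, sampling_rate, endpoint=False)
    signal = np.sin(2 * np.pi * 440 * t)

    # Apply a Hamming window to one 25 ms frame before analysis
    frame_size = int(0.025 * sampling_rate)
    frame = signal[:frame_size] * np.hamming(frame_size)

    # Convolve the frame with a short smoothing kernel
    kernel = np.ones(5) / 5.0
    smoothed = np.convolve(frame, kernel, mode='same')
    print("Frame length:", len(frame), "Smoothed length:", len(smoothed))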

Chapter 9, Dissecting Time Series and Sequential Data, introduces the concept of structured learning. We will come to understand the various characteristics of time series data. We will learn about conditional random fields and see how to use them for prediction. We will then use this technique to analyze stock market data.
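
Before getting to conditional random fields, a common first step with time series data is computing rolling statistics; the sketch below uses pandas on synthetic prices standing in for real stock market data:

    import numpy as np
    import pandas as pd

    # Synthetic daily closing prices standing in for stock market data
    rng = np.random.default_rng(1)
    dates = pd.date_range("2020-01-01", periods=200, freq="D")
    prices = pd.Series(100 + rng.normal(0, 1, size=200).cumsum(), index=dates)

    # Rolling mean and standard deviation over a 20-day window
    rolling_mean = prices.rolling(window=20).mean()
    rolling_std = prices.rolling(window=20).std()
    print(rolling_mean.tail())
    print(rolling_std.tail())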

Chapter 10, Image Content Analysis, shows how to analyze images. We will learn how to detect keypoints and extract features from images. We will discuss the concept of a bag of visual words and see how it applies to image classification. We will learn how to build a visual code book and extract feature vectors for image classification. We will then understand how to use extremely random forests to perform object recognition.
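
A hedged keypoint-detection sketch with OpenCV's ORB detector (the book's exact detector and images may differ; 'sample.jpg' is a placeholder path):

    import cv2

    # Load an image in grayscale; 'sample.jpg' is a placeholder path
    image = cv2.imread('sample.jpg', cv2.IMREAD_GRAYSCALE)

    # Detect keypoints and compute descriptors with ORB
    orb = cv2.ORB_create(nfeatures=500)
    keypoints, descriptors = orb.detectAndCompute(image, None)
    print("Keypoints detected:", len(keypoints))

    # Draw the keypoints on the image for visual inspection
    annotated = cv2.drawKeypoints(image, keypoints, None, color=(0, 255, 0))
    cv2.imwrite('keypoints.jpg', annotated)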

Chapter 11, Biometric Face Recognition, shows how to perform face recognition. We will understand the differences between face detection and face recognition. We will talk about dimensionality reduction and how to use PCA to achieve this. We will learn about Fisher Faces and how they can be used for face recognition. We will perform face detection on a live video. We will then use these techniques to identify the person in front of the camera.
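
A minimal PCA sketch, using scikit-learn's digits images as a stand-in for face data:

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    # Small images flattened into feature vectors (a stand-in for face images)
    X, y = load_digits(return_X_y=True)
    print("Original dimensionality:", X.shape[1])

    # Project onto the top principal components
    pca = PCA(n_components=20, whiten=True, random_state=0)
    X_reduced = pca.fit_transform(X)
    print("Reduced dimensionality:", X_reduced.shape[1])
    print("Variance explained:", pca.explained_variance_ratio_.sum())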

Chapter 12, Reinforcement Learning Techniques, discusses reinforcement learning techniques and their applications. It also discusses the elements of a reinforcement learning setup, approaches to reinforcement learning, and its challenges, along with topics such as Markov decision processes, the exploration-exploitation dilemma, discounted future reward, and Q-learning.
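
The heart of Q-learning is a single update rule; here is a toy, tabular sketch of it with invented states and rewards:

    import numpy as np

    # Q-table for a toy problem with 5 states and 2 actions
    n_states, n_actions = 5, 2
    Q = np.zeros((n_states, n_actions))

    alpha = 0.1   # learning rate
    gamma = 0.9   # discount factor for future rewards

    # One hypothetical transition: in state 2, action 1 gives reward 1.0 and leads to state 3
    state, action, reward, next_state = 2, 1, 1.0, 3

    # Q-learning update: Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
    print(Q)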

Chapter 13, Deep Neural Networks, discusses deep neural networks. We will learn about perceptrons and see how they are used to build neural networks. We will explore the interconnections between multiple layers in a deep neural network. We will discuss how a neural network learns about the training data and builds a model. We will learn about the cost function and backpropagation. We will then use these techniques to perform optical character recognition. We will work with different frameworks for deep learning, including TensorFlow, PyTorch, and Caffe.
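
As a quick sketch of a network learning via backpropagation, the example below trains a small multi-layer perceptron on scikit-learn's handwritten digits (using MLPClassifier rather than TensorFlow, PyTorch, or Caffe):

    from sklearn.datasets import load_digits
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split

    # Handwritten digits as a small optical character recognition task
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # A multi-layer perceptron trained with backpropagation
    network = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
    network.fit(X_train, y_train)
    print("Test accuracy:", network.score(X_test, y_test))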

Chapter 14, Unsupervised Representation Learning, discusses the problem of learning representations for data such as images, videos, and natural language corpora in an unsupervised manner. We will go through autoencoders and their applications, word embeddings, and t-SNE. We will also use denoising autoencoders to detect fraudulent transactions using word embeddings. Lastly, we will move on to implementing LDA with the help of various recipes.
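
A minimal t-SNE sketch, again with scikit-learn's digits images standing in for the chapter's data:

    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE

    # Learn a 2D representation of the images without using their labels
    X, y = load_digits(return_X_y=True)
    embedding = TSNE(n_components=2, random_state=0).fit_transform(X)
    print("Embedding shape:", embedding.shape)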

Chapter 15, Automated Machine Learning and Transfer Learning, discusses recipes based on automated machine learning and transfer learning. We will learn how to work with Auto-WEKA and how to use AutoML to generate machine learning pipelines. We will learn how to work with Auto-Keras and then move on to using MLBox for leak detection. Furthermore, we will learn how to implement transfer learning with the help of multiple recipes.

Chapter 16, Unlocking Production Issues, discusses production-related issues. We will go through how to handle unstructured data, along with how to keep track of changes in our machine learning models. We will also learn how to optimize a retraining schedule and how to deploy our machine learning models.
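
One concrete piece of the deployment story is persisting a trained model so that a serving process can reload it; here is a hedged sketch with joblib (the filename is a placeholder):

    import joblib
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Train a model, then persist it so another process can reload it later
    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    joblib.dump(model, 'model.joblib')       # save to disk
    restored = joblib.load('model.joblib')   # reload elsewhere
    print("Restored model accuracy:", restored.score(X, y))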