Python Machine Learning, Second Edition

Python Machine Learning, Second Edition - Second Edition

By : Sebastian Raschka, Vahid Mirjalili

Buy this Book

Python Machine Learning, Second Edition - Second Edition

By: Sebastian Raschka, Vahid Mirjalili

Buy this Book

Overview of this book

Publisher's Note: This edition from 2017 is outdated and is not compatible with TensorFlow 2 or any of the most recent updates to Python libraries. A new third edition, updated for 2020 and featuring TensorFlow 2 and the latest in scikit-learn, reinforcement learning, and GANs, has now been published. Machine learning is eating the software world, and now deep learning is extending machine learning. Understand and work at the cutting edge of machine learning, neural networks, and deep learning with this second edition of Sebastian Raschka’s bestselling book, Python Machine Learning. Using Python's open source libraries, this book offers the practical knowledge and techniques you need to create and contribute to machine learning, deep learning, and modern data analysis. Fully extended and modernized, Python Machine Learning Second Edition now includes the popular TensorFlow 1.x deep learning library. The scikit-learn code has also been fully updated to v0.18.1 to include improvements and additions to this versatile machine learning library. Sebastian Raschka and Vahid Mirjalili’s unique insight and expertise introduce you to machine learning and deep learning algorithms from scratch, and show you how to apply them to practical industry challenges using realistic and interesting examples. By the end of the book, you’ll be ready to meet the new data analysis opportunities. If you’ve read the first edition of this book, you’ll be delighted to find a balance of classical ideas and modern insights into machine learning. Every chapter has been critically updated, and there are new chapters on key technologies. You’ll be able to learn and work with TensorFlow 1.x more deeply than ever before, and get essential coverage of the Keras neural network library, along with updates to scikit-learn 0.18.1.

Python Machine Learning Second Edition

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Packt is Searching for Authors Like You

Preface

Free Chapter

Giving Computers the Ability to Learn from Data

Building intelligent machines to transform data into knowledge

The three different types of machine learning

Introduction to the basic terminology and notations

A roadmap for building machine learning systems

Using Python for machine learning

Summary

Training Simple Machine Learning Algorithms for Classification

Artificial neurons – a brief glimpse into the early history of machine learning

Implementing a perceptron learning algorithm in Python

Adaptive linear neurons and the convergence of learning

Summary

A Tour of Machine Learning Classifiers Using scikit-learn

Choosing a classification algorithm

First steps with scikit-learn – training a perceptron

Modeling class probabilities via logistic regression

Maximum margin classification with support vector machines

Solving nonlinear problems using a kernel SVM

Decision tree learning

K-nearest neighbors – a lazy learning algorithm

Summary

Building Good Training Sets – Data Preprocessing

Dealing with missing data

Handling categorical data

Partitioning a dataset into separate training and test sets

Bringing features onto the same scale

Selecting meaningful features

Assessing feature importance with random forests

Summary

Compressing Data via Dimensionality Reduction

Unsupervised dimensionality reduction via principal component analysis

Supervised data compression via linear discriminant analysis

Using kernel principal component analysis for nonlinear mappings

Summary

Learning Best Practices for Model Evaluation and Hyperparameter Tuning

Streamlining workflows with pipelines

Using k-fold cross-validation to assess model performance

Debugging algorithms with learning and validation curves

Fine-tuning machine learning models via grid search

Looking at different performance evaluation metrics

Dealing with class imbalance

Summary

Combining Different Models for Ensemble Learning

Learning with ensembles

Combining classifiers via majority vote

Bagging – building an ensemble of classifiers from bootstrap samples

Leveraging weak learners via adaptive boosting

Summary

Applying Machine Learning to Sentiment Analysis

Preparing the IMDb movie review data for text processing

Introducing the bag-of-words model

Training a logistic regression model for document classification

Working with bigger data – online algorithms and out-of-core learning

Topic modeling with Latent Dirichlet Allocation

Summary

Embedding a Machine Learning Model into a Web Application

Serializing fitted scikit-learn estimators

Setting up an SQLite database for data storage

Developing a web application with Flask

Turning the movie review classifier into a web application

Deploying the web application to a public server

Summary

Predicting Continuous Target Variables with Regression Analysis

Introducing linear regression

Exploring the Housing dataset

Implementing an ordinary least squares linear regression model

Fitting a robust regression model using RANSAC

Evaluating the performance of linear regression models

Using regularized methods for regression

Turning a linear regression model into a curve – polynomial regression

Dealing with nonlinear relationships using random forests

Summary

Working with Unlabeled Data – Clustering Analysis

Grouping objects by similarity using k-means

Organizing clusters as a hierarchical tree

Locating regions of high density via DBSCAN

Summary

Implementing a Multilayer Artificial Neural Network from Scratch

Modeling complex functions with artificial neural networks

Classifying handwritten digits

Training an artificial neural network

About the convergence in neural networks

A few last words about the neural network implementation

Summary

Parallelizing Neural Network Training with TensorFlow

TensorFlow and training performance

Training neural networks efficiently with high-level TensorFlow APIs

Choosing activation functions for multilayer networks

Summary

Going Deeper – The Mechanics of TensorFlow

Key features of TensorFlow

TensorFlow ranks and tensors

Understanding TensorFlow's computation graphs

Placeholders in TensorFlow

Variables in TensorFlow

Building a regression model

Executing objects in a TensorFlow graph using their names

Saving and restoring a model in TensorFlow

Transforming Tensors as multidimensional data arrays

Utilizing control flow mechanics in building graphs

Visualizing the graph with TensorBoard

Summary

Classifying Images with Deep Convolutional Neural Networks

Building blocks of convolutional neural networks

Putting everything together to build a CNN

Implementing a deep convolutional neural network using TensorFlow

Summary

Modeling Sequential Data Using Recurrent Neural Networks

Introducing sequential data

RNNs for modeling sequences

Implementing a multilayer RNN for sequence modeling in TensorFlow

Project one – performing sentiment analysis of IMDb movie reviews using multilayer RNNs

Project two – implementing an RNN for character-level language modeling in TensorFlow

Chapter and book summary

Index

Customer Reviews

5 star

4 star

3 star

2 star

1 star

The three different types of machine learning

In this section, we will take a look at the three types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. We will learn about the fundamental differences between the three different learning types and, using conceptual examples, we will develop an intuition for the practical problem domains where these can be applied:

Making predictions about the future with supervised learning

The main goal in supervised learning is to learn a model from labeled training data that allows us to make predictions about unseen or future data. Here, the term supervised refers to a set of samples where the desired output signals (labels) are already known.

Considering the example of email spam filtering, we can train a model using a supervised machine learning algorithm on a corpus of labeled emails, emails that are correctly marked as spam or not-spam, to predict whether a new email belongs to either of the two categories. A supervised learning task with discrete class labels, such as in the previous email spam filtering example, is also called a classification task. Another subcategory of supervised learning is regression, where the outcome signal is a continuous value:

Classification for predicting class labels

Classification is a subcategory of supervised learning where the goal is to predict the categorical class labels of new instances, based on past observations. Those class labels are discrete, unordered values that can be understood as the group memberships of the instances. The previously mentioned example of email spam detection represents a typical example of a binary classification task, where the machine learning algorithm learns a set of rules in order to distinguish between two possible classes: spam and non-spam emails.

However, the set of class labels does not have to be of a binary nature. The predictive model learned by a supervised learning algorithm can assign any class label that was presented in the training dataset to a new, unlabeled instance. A typical example of a multiclass classification task is handwritten character recognition. Here, we could collect a training dataset that consists of multiple handwritten examples of each letter in the alphabet. Now, if a user provides a new handwritten character via an input device, our predictive model will be able to predict the correct letter in the alphabet with certain accuracy. However, our machine learning system would be unable to correctly recognize any of the digits zero to nine, for example, if they were not part of our training dataset.

The following figure illustrates the concept of a binary classification task given 30 training samples; 15 training samples are labeled as negative class (minus signs) and 15 training samples are labeled as positive class (plus signs). In this scenario, our dataset is two-dimensional, which means that each sample has two values associated with it: and . Now, we can use a supervised machine learning algorithm to learn a rule—the decision boundary represented as a dashed line—that can separate those two classes and classify new data into each of those two categories given its and values:

Regression for predicting continuous outcomes

We learned in the previous section that the task of classification is to assign categorical, unordered labels to instances. A second type of supervised learning is the prediction of continuous outcomes, which is also called regression analysis. In regression analysis, we are given a number of predictor (explanatory) variables and a continuous response variable (outcome or target), and we try to find a relationship between those variables that allows us to predict an outcome.

For example, let's assume that we are interested in predicting the math SAT scores of our students. If there is a relationship between the time spent studying for the test and the final scores, we could use it as training data to learn a model that uses the study time to predict the test scores of future students who are planning to take this test.

Note

The term regression was devised by Francis Galton in his article Regression towards Mediocrity in Hereditary Stature in 1886. Galton described the biological phenomenon that the variance of height in a population does not increase over time. He observed that the height of parents is not passed on to their children, but instead the children's height is regressing towards the population mean.

The following figure illustrates the concept of linear regression. Given a predictor variable x and a response variable y, we fit a straight line to this data that minimizes the distance—most commonly the average squared distance—between the sample points and the fitted line. We can now use the intercept and slope learned from this data to predict the outcome variable of new data:

Solving interactive problems with reinforcement learning

Another type of machine learning is reinforcement learning. In reinforcement learning, the goal is to develop a system (agent) that improves its performance based on interactions with the environment. Since the information about the current state of the environment typically also includes a so-called reward signal, we can think of reinforcement learning as a field related to supervised learning. However, in reinforcement learning this feedback is not the correct ground truth label or value, but a measure of how well the action was measured by a reward function. Through its interaction with the environment, an agent can then use reinforcement learning to learn a series of actions that maximizes this reward via an exploratory trial-and-error approach or deliberative planning.

A popular example of reinforcement learning is a chess engine. Here, the agent decides upon a series of moves depending on the state of the board (the environment), and the reward can be defined as win or lose at the end of the game:

There are many different subtypes of reinforcement learning. However, a general scheme is that the agent in reinforcement learning tries to maximize the reward by a series of interactions with the environment. Each state can be associated with a positive or negative reward, and a reward can be defined as accomplishing an overall goal, such as winning or losing a game of chess. For instance, in chess the outcome of each move can be thought of as a different state of the environment. To explore the chess example further, let's think of visiting certain locations on the chess board as being associated with a positive event—for instance, removing an opponent's chess piece from the board or threatening the queen. Other positions, however, are associated with a negative event, such as losing a chess piece to the opponent in the following turn. Now, not every turn results in the removal of a chess piece, and reinforcement learning is concerned with learning the series of steps by maximizing a reward based on immediate and delayed feedback.

While this section provides a basic overview of reinforcement learning, please note that applications of reinforcement learning are beyond the scope of this book, which primarily focusses on classification, regression analysis, and clustering.

Discovering hidden structures with unsupervised learning

In supervised learning, we know the right answer beforehand when we train our model, and in reinforcement learning, we define a measure of reward for particular actions by the agent. In unsupervised learning, however, we are dealing with unlabeled data or data of unknown structure. Using unsupervised learning techniques, we are able to explore the structure of our data to extract meaningful information without the guidance of a known outcome variable or reward function.

Finding subgroups with clustering

Clustering is an exploratory data analysis technique that allows us to organize a pile of information into meaningful subgroups (clusters) without having any prior knowledge of their group memberships. Each cluster that arises during the analysis defines a group of objects that share a certain degree of similarity but are more dissimilar to objects in other clusters, which is why clustering is also sometimes called unsupervised classification. Clustering is a great technique for structuring information and deriving meaningful relationships from data. For example, it allows marketers to discover customer groups based on their interests, in order to develop distinct marketing programs.

The following figure illustrates how clustering can be applied to organizing unlabeled data into three distinct groups based on the similarity of their features and :

Dimensionality reduction for data compression

Another subfield of unsupervised learning is dimensionality reduction. Often we are working with data of high dimensionality—each observation comes with a high number of measurements—that can present a challenge for limited storage space and the computational performance of machine learning algorithms. Unsupervised dimensionality reduction is a commonly used approach in feature preprocessing to remove noise from data, which can also degrade the predictive performance of certain algorithms, and compress the data onto a smaller dimensional subspace while retaining most of the relevant information.

Sometimes, dimensionality reduction can also be useful for visualizing data, for example, a high-dimensional feature set can be projected onto one-, two-, or three-dimensional feature spaces in order to visualize it via 3D or 2D scatterplots or histograms. The following figure shows an example where nonlinear dimensionality reduction was applied to compress a 3D Swiss Roll onto a new 2D feature subspace:

Python Machine Learning, Second Edition - Second Edition

By : Sebastian Raschka, Vahid Mirjalili

Python Machine Learning, Second Edition - Second Edition

By: Sebastian Raschka, Vahid Mirjalili

Overview of this book

Related Content you might be interested in

Current Title:

Python Machine Learning, Second Edition - Second Edition

Mastering Machine Learning with scikit-learn

Machine Learning for OpenCV

Machine Learning for OpenCV 4.

The three different types of machine learning

Making predictions about the future with supervised learning

Classification for predicting class labels

Regression for predicting continuous outcomes

Note

Solving interactive problems with reinforcement learning

Discovering hidden structures with unsupervised learning

Finding subgroups with clustering

Dimensionality reduction for data compression